(54) Title: NOVEL PROTEINS AND NUCLEIC ACIDS ENCODING SAME

(57) Abstract: The present invention provides novel isolated polynucleotides and small molecule target polypeptides encoded by
the polynucleotides. Antibodies that immunospecifically bind to a novel small molecule target polypeptide or any derivative, variant,
mutant or fragment of that polypeptide, polynucleotide or antibody are disclosed, as are methods in which the small molecule target
polypeptide, polynucleotide and antibody are utilized in the detection and treatment of a broad range of pathological states. More
specifically, the present invention discloses methods of using recombinantly expressed and/or endogenously expressed proteins in
various screening procedures for the purpose of identifying therapeutic antibodies and therapeutic small molecules associated with
diseases. The invention further discloses therapeutic, diagnostic and research methods for diagnosis, treatment, and prevention of
disorders involving any one of these novel human nucleic acids and proteins.
NOVEL PROTEINS AND NUCLEIC ACIDS ENCODING SAME

FIELD OF THE INVENTION 

The present invention relates to novel polypeptides that are targets of small 
molecule drugs and that have properties related to stimulation of biochemical or 
physiological responses in a cell, a tissue, an organ or an organism. More particularly, the 
novel polypeptides are gene products of novel genes, or are specified biologically active 
fragments or derivatives thereof. Methods of use encompass diagnostic and prognostic 
assay procedures as well as methods of treating diverse pathological conditions. 






WO 03/029424 



PCT/US02/31373 



BACKGROUND 

Eukaryotic cells are characterized by biochemical and physiological processes 
which under normal conditions are exquisitely balanced to achieve the preservation and 
propagation of the cells. When such cells are components of multicellular organisms such 
as vertebrates, or more particularly organisms such as mammals, the regulation of the
biochemical and physiological processes involves intricate signaling pathways. Frequently, 
such signaling pathways involve extracellular signaling proteins, cellular receptors that 
bind the signaling proteins and signal transducing components located within the cells. 

Signaling proteins may be classified as endocrine effectors, paracrine effectors or
autocrine effectors. Endocrine effectors are signaling molecules secreted by a given organ
into the circulatory system, which are then transported to a distant target organ or tissue. 
The target cells include the receptors for the endocrine effector, and when the endocrine 
effector binds, a signaling cascade is induced. Paracrine effectors involve secreting cells 
and receptor cells in close proximity to each other, for example two different classes of
cells in the same tissue or organ. One class of cells secretes the paracrine effector, which
then reaches the second class of cells, for example by diffusion through the extracellular
fluid. The second class of cells contains the receptors for the paracrine effector; binding of
the effector results in induction of the signaling cascade that elicits the corresponding 
biochemical or physiological effect. Autocrine effectors are highly analogous to paracrine
effectors, except that the same cell type that secretes the autocrine effector also contains the
receptor. Thus the autocrine effector binds to receptors on the same cell, or on identical 
neighboring cells. The binding process then elicits the characteristic biochemical or 
physiological effect. 

Signaling processes may elicit a variety of effects on cells and tissues including, by
way of nonlimiting example, induction of cell or tissue proliferation, suppression of growth
or proliferation, induction of differentiation or maturation of a cell or tissue, and 
suppression of differentiation or maturation of a cell or tissue. 

Many pathological conditions involve dysregulation of expression of important 
effector proteins. In certain classes of pathologies the dysregulation is manifested as 
diminished or suppressed level of synthesis and secretion of protein effectors. In other
classes of pathologies the dysregulation is manifested as increased or up-regulated level of 
synthesis and secretion of protein effectors. In a clinical setting a subject may be suspected 
of suffering from a condition brought on by altered or mis-regulated levels of a protein
effector of interest. Therefore there is a need to assay for the level of the protein effector 
of interest in a biological sample from such a subject, and to compare the level with that 
characteristic of a nonpathological condition. There also is a need to provide the protein 
effector as a product of manufacture. Administration of the effector to a subject in need
thereof is useful in treatment of the pathological condition. Accordingly, there is a need for 
a method of treatment of a pathological condition brought on by diminished or suppressed
levels of the protein effector of interest. In addition, there is a need for a method of
treatment of a pathological condition brought on by increased or up-regulated levels of
the protein effector of interest.

Small molecule targets have been implicated in various disease states or 
pathologies. These targets may be proteins, and particularly enzymatic proteins, which are 
acted upon by small molecule drugs for the purpose of altering target function and 
achieving a desired result. Cellular, animal and clinical studies can be performed to
elucidate the genetic contribution to the etiology and pathogenesis of conditions in which
small molecule targets are implicated in a variety of physiologic, pharmacologic or native 
states. These studies utilize the core technologies at CuraGen Corporation to look at 
differential gene expression, protein-protein interactions, large-scale sequencing of 
expressed genes and the association of genetic variations such as, but not limited to, single
nucleotide polymorphisms (SNPs) or splice variants in and between biological samples
from experimental and control groups. The goal of such studies is to identify potential 
avenues for therapeutic intervention in order to prevent, treat the consequences or cure the 
conditions. 

In order to treat diseases, pathologies and other abnormal states or conditions in 
which a mammalian organism has been diagnosed as being, or as being at risk for
becoming, other than in a normal state or condition, it is important to identify new 
therapeutic agents. Such a procedure includes at least the steps of identifying a target 
component within an affected tissue or organ, and identifying a candidate therapeutic agent 
that modulates the functional attributes of the target. The target component may be any 
biological macromolecule implicated in the disease or pathology. Commonly the target is a
polypeptide or protein with specific functional attributes. Other classes of macromolecule 
may be a nucleic acid, a polysaccharide, a lipid such as a complex lipid or a glycolipid; in 
addition a target may be a sub-cellular structure or extra-cellular structure that is comprised 
of more than one of these classes of macromolecule. Once such a target has been
identified, it may be employed in a screening assay in order to identify favorable candidate 
therapeutic agents from among a large population of substances or compounds. 

In many cases the objective of such screening assays is to identify small molecule 
candidates; this is commonly approached by the use of combinatorial methodologies to
develop the population of substances to be tested. The implementation of high throughput 
screening methodologies is advantageous when working with large, combinatorial libraries 
of compounds. 

SUMMARY OF THE INVENTION 

The invention includes nucleic acid sequences and the novel polypeptides they
encode. The novel nucleic acids and polypeptides are referred to herein as NOVX, or
NOV1, NOV2, NOV3, etc., nucleic acids and polypeptides. These nucleic acids and 
polypeptides, as well as derivatives, homologs, analogs and fragments thereof, will 
hereinafter be collectively designated as "NOVX" nucleic acid, which represents the
nucleotide sequence selected from the group consisting of SEQ ID NO: 2n-1, wherein n is
an integer between 1 and 124, or polypeptide sequences, which represents the group 
consisting of SEQ ID NO: 2n, wherein n is an integer between 1 and 124. 
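
By way of nonlimiting illustration, the numbering convention stated above (odd SEQ ID NOs for nucleic acids, even SEQ ID NOs for the polypeptides they encode) can be sketched as follows. The function name `seq_id_numbers` and its `max_n` parameter are hypothetical and are introduced here only for illustration; they do not appear in the specification.

```python
def seq_id_numbers(n: int, max_n: int = 124) -> tuple[int, int]:
    """Return (nucleic acid SEQ ID NO, polypeptide SEQ ID NO) for index n.

    Follows the stated convention: nucleic acid = 2n-1, polypeptide = 2n,
    with n an integer between 1 and max_n (124 in the text above).
    """
    if not 1 <= n <= max_n:
        raise ValueError(f"n must be between 1 and {max_n}, got {n}")
    return 2 * n - 1, 2 * n


if __name__ == "__main__":
    for n in (1, 2, 124):
        nucleic, polypeptide = seq_id_numbers(n)
        print(f"n={n}: nucleic acid SEQ ID NO: {nucleic}, "
              f"polypeptide SEQ ID NO: {polypeptide}")
```

Under this convention the largest index, n = 124, corresponds to SEQ ID NOs 247 and 248.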

In one aspect, the invention provides an isolated polypeptide comprising a mature 
form of a NOVX amino acid. One example is a variant of a mature form of a NOVX
amino acid sequence, wherein any amino acid in the mature form is changed to a different
amino acid, provided that no more than 15% of the amino acid residues in the sequence of 
the mature form are so changed. The amino acid can be, for example, a NOVX amino acid 
sequence or a variant of a NOVX amino acid sequence, wherein any amino acid specified 
in the chosen sequence is changed to a different amino acid, provided that no more than
15% of the amino acid residues in the sequence are so changed. The invention also
includes fragments of any of these. In another aspect, the invention also includes an 
isolated nucleic acid that encodes a NOVX polypeptide, or a fragment, homolog, analog or 
derivative thereof. 
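
As a minimal sketch of how the "no more than 15%" proviso might be checked in practice, the following assumes two sequences of equal length that differ only by substitutions (no insertions or deletions), given as one-letter amino acid strings. The function names and the example sequences are hypothetical and are not taken from the specification.

```python
def count_substitutions(reference: str, variant: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    if not reference:
        raise ValueError("sequences must not be empty")
    return sum(1 for a, b in zip(reference, variant) if a != b)


def within_substitution_limit(reference: str, variant: str,
                              max_percent: int = 15) -> bool:
    """True if no more than max_percent of the residues are changed.

    Integer arithmetic is used so that a sequence sitting exactly at
    the limit is not rejected through floating-point rounding.
    """
    changed = count_substitutions(reference, variant)
    return changed * 100 <= max_percent * len(reference)


if __name__ == "__main__":
    reference = "MKTAYIAKQRQISFVKSHFS"      # 20 residues (made-up example)
    three_changes = "MKSAYLAKQRQISYVKSHFS"  # 3 of 20 residues changed
    four_changes = "MKSAYLAKQRQISYVKSHFT"   # 4 of 20 residues changed
    print(within_substitution_limit(reference, three_changes))
    print(within_substitution_limit(reference, four_changes))
```

The same comparison applies to the nucleotide provisos elsewhere in this summary by passing nucleotide strings, and to the 10% proviso by setting `max_percent=10`.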

Also included in the invention is a NOVX polypeptide that is a naturally occurring 
allelic variant of a NOVX sequence. In one embodiment, the allelic variant includes an
amino acid sequence that is the translation of a nucleic acid sequence differing by a single 
nucleotide from a NOVX nucleic acid sequence. In another embodiment, the NOVX 
polypeptide is a variant polypeptide described therein, wherein any amino acid specified in
the chosen sequence is changed to provide a conservative substitution. In one embodiment,
the invention discloses a method for determining the presence or amount of the NOVX
polypeptide in a sample. The method involves the steps of: providing a sample;
introducing the sample to an antibody that binds immunospecifically to the polypeptide;
and determining the presence or amount of antibody bound to the NOVX polypeptide, 
thereby determining the presence or amount of the NOVX polypeptide in the sample. In 
another embodiment, the invention provides a method for determining the presence of or 
predisposition to a disease associated with altered levels of a NOVX polypeptide in a
mammalian subject. This method involves the steps of: measuring the level of expression
of the polypeptide in a sample from the first mammalian subject; and comparing the 
amount of the polypeptide in the sample of the first step to the amount of the polypeptide 
present in a control sample from a second mammalian subject known not to have, or not to 
be predisposed to, the disease, wherein an alteration in the expression level of the
polypeptide in the first subject as compared to the control sample indicates the presence of
or predisposition to the disease. 
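
The comparison step of this method can be sketched as follows. The specification does not state how large a difference constitutes an "alteration"; the fold-change threshold used here (`min_fold_change`, defaulting to 2.0) is purely an illustrative assumption, as are the function name and the example values.

```python
def expression_altered(subject_level: float, control_level: float,
                       min_fold_change: float = 2.0) -> bool:
    """Flag an altered expression level relative to a control sample.

    The level is treated as altered when it is increased or decreased
    by at least min_fold_change relative to the control. The threshold
    is an assumption for illustration, not a value from the text.
    """
    if subject_level < 0 or control_level <= 0:
        raise ValueError("levels must be non-negative, control must be > 0")
    if min_fold_change <= 1:
        raise ValueError("min_fold_change must be greater than 1")
    ratio = subject_level / control_level
    return ratio >= min_fold_change or ratio <= 1 / min_fold_change


if __name__ == "__main__":
    # Hypothetical measurements in arbitrary units.
    print(expression_altered(10.0, 4.0))  # up-regulated relative to control
    print(expression_altered(5.0, 4.0))   # within the assumed normal range
    print(expression_altered(1.0, 4.0))   # down-regulated relative to control
```

Both directions are tested because the background section describes pathologies arising from diminished as well as increased levels of an effector.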

In a further embodiment, the invention includes a method of identifying an agent 
that binds to a NOVX polypeptide. This method involves the steps of: introducing the 
polypeptide to the agent; and determining whether the agent binds to the polypeptide. In
various embodiments, the agent is a cellular receptor or a downstream effector.

In another aspect, the invention provides a method for identifying a potential 
therapeutic agent for use in treatment of a pathology, wherein the pathology is related to 
aberrant expression or aberrant physiological interactions of a NOVX polypeptide. The 
method involves the steps of: providing a cell expressing the NOVX polypeptide and
having a property or function ascribable to the polypeptide; contacting the cell with a
composition comprising a candidate substance; and determining whether the substance 
alters the property or function ascribable to the polypeptide; whereby, if an alteration 
observed in the presence of the substance is not observed when the cell is contacted with a 
composition devoid of the substance, the substance is identified as a potential therapeutic
agent. In another aspect, the invention describes a method for screening for a modulator of
activity or of latency or predisposition to a pathology associated with the NOVX 
polypeptide. This method involves the following steps: administering a test compound to a 
test animal at increased risk for a pathology associated with the NOVX polypeptide, 
wherein the test animal recombinantly expresses the NOVX polypeptide. This method
involves the steps of measuring the activity of the NOVX polypeptide in the test animal
after administering the test compound; and comparing the activity of the protein in the
test animal with the activity of the NOVX polypeptide in a control animal not administered 
the polypeptide, wherein a change in the activity of the NOVX polypeptide in the test 
animal relative to the control animal indicates the test compound is a modulator of latency 
of, or predisposition to, a pathology associated with the NOVX polypeptide. In one 
embodiment, the test animal is a recombinant test animal that expresses a test protein 
transgene or expresses the transgene under the control of a promoter at an increased level 
relative to a wild-type test animal, and wherein the promoter is not the native gene 
promoter of the transgene. In another aspect, the invention includes a method for
modulating the activity of the NOVX polypeptide, the method comprising introducing a
cell sample expressing the NOVX polypeptide with a compound that binds to the 
polypeptide in an amount sufficient to modulate the activity of the polypeptide. 

The invention also includes an isolated nucleic acid that encodes a NOVX 
polypeptide, or a fragment, homolog, analog or derivative thereof. In a preferred 
embodiment, the nucleic acid molecule comprises the nucleotide sequence of a naturally 
occurring allelic nucleic acid variant. In another embodiment, the nucleic acid encodes a 
variant polypeptide, wherein the variant polypeptide has the polypeptide sequence of a 
naturally occurring polypeptide variant. In another embodiment, the nucleic acid molecule 
differs by a single nucleotide from a NOVX nucleic acid sequence. In one embodiment, 
the NOVX nucleic acid molecule hybridizes under stringent conditions to the nucleotide 
sequence selected from the group consisting of SEQ ID NO: 2n-1, wherein n is an integer
between 1 and 124, or a complement of the nucleotide sequence. In another aspect, the 
invention provides a vector or a cell expressing a NOVX nucleotide sequence. 

In one embodiment, the invention discloses a method for modulating the activity of 
a NOVX polypeptide. The method includes the steps of: introducing a cell sample 
expressing the NOVX polypeptide with a compound that binds to the polypeptide in an 
amount sufficient to modulate the activity of the polypeptide. In another embodiment, the 
invention includes an isolated NOVX nucleic acid molecule comprising a nucleic acid 
sequence encoding a polypeptide comprising a NOVX amino acid sequence or a variant of 
a mature form of the NOVX amino acid sequence, wherein any amino acid in the mature 
form of the chosen sequence is changed to a different amino acid, provided that no more 
than 15% of the amino acid residues in the sequence of the mature form are so changed. In
another embodiment, the invention includes an amino acid sequence that is a variant of the 
NOVX amino acid sequence, in which any amino acid specified in the chosen sequence is 
changed to a different amino acid, provided that no more than 15% of the amino acid 
residues in the sequence are so changed.

In one embodiment, the invention discloses a NOVX nucleic acid fragment 
encoding at least a portion of a NOVX polypeptide or any variant of the polypeptide,
wherein any amino acid of the chosen sequence is changed to a different amino acid, 
provided that no more than 10% of the amino acid residues in the sequence are so changed.
In another embodiment, the invention includes the complement of any of the NOVX
nucleic acid molecules or a naturally occurring allelic nucleic acid variant. In another
embodiment, the invention discloses a NOVX nucleic acid molecule that encodes a variant
polypeptide, wherein the variant polypeptide has the polypeptide sequence of a naturally
occurring polypeptide variant. In another embodiment, the invention discloses a NOVX
nucleic acid, wherein the nucleic acid molecule differs by a single nucleotide from a
NOVX nucleic acid sequence. 

In another aspect, the invention includes a NOVX nucleic acid, wherein one or 
more nucleotides in the NOVX nucleotide sequence is changed to a different nucleotide 
provided that no more than 15% of the nucleotides are so changed. In one embodiment, the
invention discloses a nucleic acid fragment of the NOVX nucleotide sequence and a
nucleic acid fragment wherein one or more nucleotides in the NOVX nucleotide sequence
is changed from that selected from the group consisting of the chosen sequence to a 
different nucleotide provided that no more than 15% of the nucleotides are so changed. In 
another embodiment, the invention includes a nucleic acid molecule wherein the nucleic
acid molecule hybridizes under stringent conditions to a NOVX nucleotide sequence or a
complement of the NOVX nucleotide sequence. In one embodiment, the invention 
includes a nucleic acid molecule, wherein the sequence is changed such that no more than 
15% of the nucleotides in the coding sequence differ from the NOVX nucleotide sequence 
or a fragment thereof. 

In a further aspect, the invention includes a method for determining the presence or
amount of the NOVX nucleic acid in a sample. The method involves the steps of:
providing the sample; introducing the sample to a probe that binds to the nucleic acid 
molecule; and determining the presence or amount of the probe bound to the NOVX 
nucleic acid molecule, thereby determining the presence or amount of the NOVX nucleic
acid molecule in the sample. In one embodiment, the presence or amount of the nucleic 
acid molecule is used as a marker for cell or tissue type. 

In another aspect, the invention discloses a method for determining the presence of 
or predisposition to a disease associated with altered levels of the NOVX nucleic acid
molecule in a first mammalian subject. The method involves the steps of: measuring the
amount of NOVX nucleic acid in a sample from the first mammalian subject; and 
comparing the amount of the nucleic acid in the sample of step (a) to the amount of NOVX 
nucleic acid present in a control sample from a second mammalian subject known not to
have, or not to be predisposed to, the disease; wherein an alteration in the level of the nucleic
acid in the first subject as compared to the control sample indicates the presence of or 
predisposition to the disease. 

Unless otherwise defined, all technical and scientific terms used herein have the 
same meaning as commonly understood by one of ordinary skill in the art to which this
invention belongs. Although methods and materials similar or equivalent to those
described herein can be used in the practice or testing of the present invention, suitable
methods and materials are described below. All publications, patent applications, patents, 
and other references mentioned herein are incorporated by reference in their entirety. In 
the case of conflict, the present specification, including definitions, will control. In
addition, the materials, methods, and examples are illustrative only and not intended to be
limiting. 

Other features and advantages of the invention will be apparent from the following 
detailed description and claims. 

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides novel nucleotides and polypeptides encoded
thereby. Included in the invention are the novel nucleic acid sequences, their encoded
polypeptides, antibodies, and other related compounds. The sequences are collectively 
referred to herein as "NOVX nucleic acids" or "NOVX polynucleotides" and the 
corresponding encoded polypeptides are referred to as "NOVX polypeptides" or "NOVX
proteins." Unless indicated otherwise, "NOVX" is meant to refer to any of the novel
sequences disclosed herein. Table A provides a summary of the NOVX nucleic acids and
their encoded polypeptides. 
NOVX         Internal         SEQ ID NO        SEQ ID NO      Homology
Assignment   Identification   (nucleic acid)   (amino acid)

1a           CG106764-01      1                2              Citron Kinase
1b           268667493        3                4              RHO/RAC-Interacting Citron Kinase
1c           268667539        5                6              RHO/RAC-Interacting Citron Kinase
1d           268667543        7                8              RHO/RAC-Interacting Citron Kinase
1e           268667555        9                10             RHO/RAC-Interacting Citron Kinase
1f           268667574        11               12             RHO/RAC-Interacting Citron Kinase
1g           CG106764-02      13               14             RHO/RAC-Interacting Citron Kinase
2a           CG117662-01      15               16             Renal Renin Precursor
2b           CG117662-02      17               18             Renal Renin Precursor
3a           CG118051-01      19               20             Aldehyde Dehydrogenase
3b           CG118051-02      21               22             Aldehyde Dehydrogenase
3c           CG118051-03      23               24             Aldehyde Dehydrogenase
4a           CG120277-01      25               26             Aldehyde Dehydrogenase-3
4b           CG120277-02      27               28             Aldehyde Dehydrogenase-3
5a           CG140468-01      29               30             Serine/Threonine-Protein Kinase PAK 1
5b           CG140468-02      31               32             Serine/Threonine-Protein Kinase PAK 1
6a           CG142182-01      33               34             Ubiquitin Carboxyl-terminal Hydrolase 15
7a           CG142564-01      35               36             Carnitine O-Palmitoyltransferase I
8a           CG142797-01      37               38             Cathepsin L
9a           CG143216-01      39               40             Laminin Gamma 3 Chain Precursor
10a          CG143787-01      41               42             Disintegrin Protease
10b          278889162        43               44             Disintegrin Protease
10c          278689868        45               46             Disintegrin Protease
11a          CG144112-01      47               48             NEUROPSIN PRECURSOR like homo sapiens
11b          CG144112-04      49               50             Neuropsin Precursor
11c          255501898        51               52             Neuropsin Precursor
11d          255612524        53               54             Neuropsin Precursor
11e          255612566        55               56             Neuropsin Precursor
11f          306434072        57               58             Neuropsin Precursor
11g          CG144112-02      59               60             Neuropsin Precursor
11h          CG144112-03      61               62             Neuropsin Precursor
12a          CG144497-01      63               64             Adenylosuccinate Synthetase Muscle Isozyme
13a          CG144686-01      65               66             Mast Cell Carboxypeptidase A Precursor
13b          278690008        67               68             Mast Cell Carboxypeptidase A Precursor
13c          278690035        69               70             Mast Cell Carboxypeptidase A Precursor
13d          CG144686-02      71               72             Mast Cell Carboxypeptidase A Precursor
14a          CG144906-01      73               74             Testisin Precursor
14b          CG144906-02      75               76             Testisin Precursor
15a          CG144997-01      77               78             RNaseHI
15b          278693648        79               80             RNaseHI
15c          278480974        81               82             RNaseHI
15d          278498047        83               84             RNaseHI
15e          CG144997-02      85               86
16a          CG145494-01      87               88             PRESTIN
17a          CG145722-01      89               90             WEE1
18a          CG145754-01      91               92             Kallikrein 7 Precursor
18b          CG145754-03      93               94             Kallikrein 7 Precursor
18c          CG145754-02      95               96             Kallikrein 7 Precursor
18d          252718128        97               98             Kallikrein
18e          252718152        99               100            Kallikrein
18f          247856668        101              102            Kallikrein 7 Precursor
18g          247856705        103              104            Kallikrein 7 Precursor
19a          CG146279-01      105              106            Novel Potassium Channel Subfamily K Member 10 (TREK-2)
20a          CG146374-01      107              108            Glycogen Branching Enzyme
21a          CG146403-01      109              110            Diacylglycerol Acyltransferase 2
22a          CG146513-01      111              112            Diacylglycerol Acyltransferase 2
23a          CG146522-01      113              114            Diacylglycerol Acyltransferase 2
24a          CG146531-01      115              116            Diacylglycerol Acyltransferase 2
25a          CG147274-01      117              118            Protease
26a          CG147351-01      119              120            Testis-Development Related NYD-SP27
27a          CG147419-01      121              122            Glutamine:Fructose-6-Phosphate Amidotransferase 1 Muscle Isoform
28a          CG148102-01      123              124            Carnitine O-Palmitoyltransferase
28b          CG148102-02      125              126            Carnitine O-Palmitoyltransferase
29a          CG148431-01      127              128            Class II Aminotransferase
29b          CG148431-02      129              130            Class II Aminotransferase
30a          CG148888-01      131              132            GALNAC 4-Sulfotransferase
31a          CG149008-01      133              134            Sodium/Hydrogen Exchanger
32a          CG149350-01      135              136            Vacuolar ATP Synthase Subunit F
32b          CG149350-02      137              138            Vacuolar ATP Synthase Subunit F
33a          CG149463-01      139              140            Serine/Threonine-Protein Kinase SGK
34a          CG149536-01      141              142            Protein-Tyrosine Phosphatase, Non-Receptor Type 2
35a          CG149964-01      143              144            Brain Mitochondrial Carrier Protein-1
35b          309326356        145              146            Brain Mitochondrial Carrier Protein-1
35c          309326444        147              148            Brain Mitochondrial Carrier Protein-1
35d          309326473        149              150            Brain Mitochondrial Carrier Protein-1
35e          CG149964-02      151              152            Brain Mitochondrial Carrier Protein-1
36a          CG150306-01      153              154            Dual Specificity Protein Phosphatase 5
37a          CG150510-01      155              156            Human Alpha-2,3-Sialyltransferase
38a          CG150704-01      157              158            Testis ecto-ADP-Ribosyltransferase Precursor
39a          CG150799-01      159              160            MASS1
39b          CG150799-02      161              162            MASS1
39c          CG150799-03      163              164            MASS1
39d          CG150799-01      165              166            MASS1
40a          CG151014-01      167              168            Metabotropic Glutamate Receptor 3
40b          CG151014-02      169              170            Metabotropic Glutamate Receptor 3
40c          CG151014-03      171              172            Metabotropic Glutamate Receptor 3
41a          CG151297-01      173              174            Calmodulin-Dependent Phosphodiesterase
41b          CG151297-02      175              176            Calmodulin-Dependent Phosphodiesterase
42a          CG151822-01      177              178            Prenylcysteine Carboxyl Methyltransferase
42b          CG151822-02      179              180            Prenylcysteine Carboxyl Methyltransferase
43a          CG152256-01      181              182            Phosphatidylserine Synthase
44a          CG171804-01      183              184            N-Acetylgalactosaminide Alpha 2,6-Sialyltransferase
45a          CG171841-01      185              186            Iron-Containing Alcohol Dehydrogenase
46a          CG173017-01      187              188            Retinoic Acid Receptor RXR-Beta
47a          CG173347-01      189              190            Serum Paraoxonase/Arylesterase 3
48a          CG56234-01       191              192            Phosphoenolpyruvate Carboxykinase 2 (PCK2)
48b          CG56234-02       193              194            Phosphoenolpyruvate Carboxykinase 2 (PCK2)
49a          CG56836-01       195              196            Cathepsin B
49b          CG56836-02       197              198            Cathepsin B
49c          CG56836-03       199              200            Cathepsin B
49d          CG56836-04       201              202            Cathepsin B
49e          247856403        203              204            Cathepsin B
49f          247856434        205              206            Cathepsin B
49g          247856497        207              208            Cathepsin B
49h          247856493        209              210            Cathepsin B
49i          247856574        211              212            Cathepsin B
49j          247856545        213              214            Cathepsin B
49k          275480714        215              216            Cathepsin B
50a          CG57284-01       217              218            RAS-Related Protein RAB-5C
50b          CG57284-03       219              220            RAS-Related Protein RAB-5C
50c          CG57284-02       221              222            RAS-Related Protein RAB-5C
51a          CG57308-01       223              224            Sulfonylurea Receptor 1
51b          CG57308-02       225              226            Sulfonylurea Receptor 1
52a          CG93659-01       227              228            Mitogen-Activated Protein Kinase Kinase Kinase 8
52b          CG93659-03       229              230            Mitogen-Activated Protein Kinase Kinase Kinase 8
52c          CG93659-02       231              232            Mitogen-Activated Protein Kinase Kinase Kinase 8
53a          CG94521-01       233              234            Cytoplasmic Glycerol-3-Phosphate Dehydrogenase [NAD+]
53b          CG94521-03       235              236            Cytoplasmic Glycerol-3-Phosphate Dehydrogenase [NAD+]
53c          CG94521-02       237              238            Cytoplasmic Glycerol-3-Phosphate Dehydrogenase [NAD+]
54a          CG96613-01       239              240            Pyruvate Dehydrogenase Kinase (PDK1)
54b          CG96613-03       241              242            Pyruvate Dehydrogenase Kinase (PDK1)
54c          CG96613-02       243              244            Pyruvate Dehydrogenase Kinase (PDK1)
55a          CG96736-01       245              246            Neutral Amino Acid Transporter B
55b          CG96736-02       247              248            Neutral Amino Acid Transporter B

Table A indicates the homology of NOVX polypeptides to known protein families. 
Thus, the nucleic acids and polypeptides, antibodies and related compounds according to 
the invention corresponding to a NOVX as identified in column 1 of Table A will be useful
in therapeutic and diagnostic applications implicated in, for example, pathologies and 
disorders associated with the known protein families identified in column 5 of Table A. 

Pathologies, diseases, disorders, conditions and the like that are associated with
NOVX sequences include, but are not limited to, cardiomyopathy, atherosclerosis,
hypertension, congenital heart defects, aortic stenosis, atrial septal defect (ASD), 
atrioventricular (A-V) canal defect, ductus arteriosus, pulmonary stenosis, subaortic 
stenosis, ventricular septal defect (VSD), valve diseases, tuberous sclerosis, scleroderma, 
obesity, metabolic disturbances associated with obesity, transplantation, 
adrenoleukodystrophy, congenital adrenal hyperplasia, prostate cancer, diabetes, metabolic 
disorders, neoplasm, adenocarcinoma, lymphoma, uterus cancer, fertility, hemophilia,
hypercoagulation, idiopathic thrombocytopenic purpura, immunodeficiencies, graft versus
host disease, AIDS, bronchial asthma, Crohn's disease, multiple sclerosis, treatment of
Albright Hereditary Osteodystrophy, infectious disease, anorexia, cancer-associated
cachexia, cancer, neurodegenerative disorders, Alzheimer's Disease, Parkinson's Disorder, 
immune disorders, hematopoietic disorders, and the various dyslipidemias, the metabolic 
syndrome X and wasting disorders associated with chronic diseases and various cancers, as 
well as conditions such as transplantation and fertility. 

NOVX nucleic acids and their encoded polypeptides are useful in a variety of 
applications and contexts. The various NOVX nucleic acids and polypeptides according to 
the invention are useful as novel members of the protein families according to the presence 
of domains and sequence relatedness to previously described proteins. Additionally, 
NOVX nucleic acids and polypeptides can also be used to identify proteins that are 
members of the family to which the NOVX polypeptides belong.

Consistent with other known members of the family of proteins, identified in 
column 5 of Table A, the NOVX polypeptides of the present invention show homology to, 
and contain domains that are characteristic of, other members of such protein families. 
Details of the sequence relatedness and domain analysis for each NOVX are presented in 
Example A. 

The NOVX nucleic acids and polypeptides can also be used to screen for molecules, 
which inhibit or enhance NOVX activity or function. Specifically, the nucleic acids and 
polypeptides according to the invention may be used as targets for the identification of
small molecules that modulate or inhibit diseases associated with the protein families listed
in Table A. 

The NOVX nucleic acids and polypeptides are also useful for detecting specific cell 
types. Details of the expression analysis for each NOVX are presented in Example C. 
Accordingly, the NOVX nucleic acids, polypeptides, antibodies and related compounds
according to the invention will have diagnostic and therapeutic applications in the detection 
of a variety of diseases with differential expression in normal vs. diseased tissues, e.g. 
detection of a variety of cancers. 

Additional utilities for NOVX nucleic acids and polypeptides according to the 
invention are disclosed herein.

NOVX clones 

NOVX nucleic acids and their encoded polypeptides are useful in a variety of 
applications and contexts. The various NOVX nucleic acids and polypeptides according to 
the invention are useful as novel members of the protein families according to the presence
of domains and sequence relatedness to previously described proteins. Additionally,
NOVX nucleic acids and polypeptides can also be used to identify proteins that are 
members of the family to which the NOVX polypeptides belong. 

The NOVX genes and their corresponding encoded proteins are useful for 
preventing, treating or ameliorating medical conditions, e.g., by protein or gene therapy.
Pathological conditions can be diagnosed by determining the amount of the new protein in
a sample or by determining the presence of mutations in the new genes. Specific uses are
described for each of the NOVX genes, based on the tissues in which they are most highly
expressed. Uses include developing products for the diagnosis or treatment of a variety of 
diseases and disorders. 

The NOVX nucleic acids and proteins of the invention are useful in potential
diagnostic and therapeutic applications and as a research tool. These include serving as a
specific or selective nucleic acid or protein diagnostic and/or prognostic marker, wherein
the presence or amount of the nucleic acid or the protein is to be assessed, as well as
potential therapeutic applications such as the following: (i) a protein therapeutic, (ii) a
small molecule drug target, (iii) an antibody target (therapeutic, diagnostic, drug
targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene
ablation), (v) a composition promoting tissue regeneration in vitro and in vivo, and (vi) a
biological defense weapon.

In one specific embodiment, the invention includes an isolated polypeptide
comprising an amino acid sequence selected from the group consisting of: (a) a mature
form of the amino acid sequence selected from the group consisting of SEQ ID NO: 2n,
wherein n is an integer between 1 and 124; (b) a variant of a mature form of the amino acid
sequence selected from the group consisting of SEQ ID NO: 2n, wherein n is an integer
between 1 and 124, wherein any amino acid in the mature form is changed to a different
amino acid, provided that no more than 15% of the amino acid residues in the sequence of
the mature form are so changed; (c) an amino acid sequence selected from the group
consisting of SEQ ID NO: 2n, wherein n is an integer between 1 and 124; (d) a variant of
the amino acid sequence selected from the group consisting of SEQ ID NO: 2n, wherein n is
an integer between 1 and 124, wherein any amino acid specified in the chosen sequence is
changed to a different amino acid, provided that no more than 15% of the amino acid
residues in the sequence are so changed; and (e) a fragment of any of (a) through (d).

In another specific embodiment, the invention includes an isolated nucleic acid
molecule comprising a nucleic acid sequence encoding a polypeptide comprising an amino
acid sequence selected from the group consisting of: (a) a mature form of the amino acid
sequence given by SEQ ID NO: 2n, wherein n is an integer between 1 and 124; (b) a variant
of a mature form of the amino acid sequence selected from the group consisting of SEQ ID
NO: 2n, wherein n is an integer between 1 and 124, wherein any amino acid in the mature
form of the chosen sequence is changed to a different amino acid, provided that no more
than 15% of the amino acid residues in the sequence of the mature form are so changed; (c)
the amino acid sequence selected from the group consisting of SEQ ID NO: 2n, wherein n
is an integer between 1 and 124; (d) a variant of the amino acid sequence selected from the
group consisting of SEQ ID NO: 2n, wherein n is an integer between 1 and 124, in which
any amino acid specified in the chosen sequence is changed to a different amino acid,
provided that no more than 15% of the amino acid residues in the sequence are so changed;
(e) a nucleic acid fragment encoding at least a portion of a polypeptide comprising the
amino acid sequence selected from the group consisting of SEQ ID NO: 2n, wherein n is an
integer between 1 and 124, or any variant of said polypeptide wherein any amino acid of the
chosen sequence is changed to a different amino acid, provided that no more than 10% of
the amino acid residues in the sequence are so changed; and (f) the complement of any of
said nucleic acid molecules.

In yet another specific embodiment, the invention includes an isolated nucleic acid
molecule, wherein said nucleic acid molecule comprises a nucleotide sequence selected
from the group consisting of: (a) the nucleotide sequence selected from the group
consisting of SEQ ID NO: 2n-1, wherein n is an integer between 1 and 124; (b) a
nucleotide sequence wherein one or more nucleotides in the nucleotide sequence selected
from the group consisting of SEQ ID NO: 2n-1, wherein n is an integer between 1 and 124,
is changed from that selected from the group consisting of the chosen sequence to a
different nucleotide, provided that no more than 15% of the nucleotides are so changed;
(c) a nucleic acid fragment of the sequence selected from the group consisting of SEQ ID
NO: 2n-1, wherein n is an integer between 1 and 124; and (d) a nucleic acid fragment
wherein one or more nucleotides in the nucleotide sequence selected from the group
consisting of SEQ ID NO: 2n-1, wherein n is an integer between 1 and 124, is changed
from that selected from the group consisting of the chosen sequence to a different
nucleotide, provided that no more than 15% of the nucleotides are so changed.
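
The numbering convention used in these embodiments pairs each clone index n with an odd-numbered nucleotide sequence (SEQ ID NO: 2n-1) and an even-numbered amino acid sequence (SEQ ID NO: 2n). The following Python sketch is illustrative only and is not part of the disclosure. It shows that mapping together with a check of a substitution ceiling such as the 15% recited above. The function names and default values are hypothetical, and the check assumes a substitution-only variant of the same length as the reference.

```python
def seq_id_pair(n):
    """Return (nucleotide SEQ ID NO, polypeptide SEQ ID NO) for clone index n."""
    if not 1 <= n <= 124:
        raise ValueError("n must be an integer between 1 and 124")
    return 2 * n - 1, 2 * n


def fraction_changed(reference, variant):
    """Fraction of positions at which the variant differs from the reference."""
    if not reference or len(reference) != len(variant):
        raise ValueError("substitution-only comparison needs equal, non-zero lengths")
    changed = sum(1 for a, b in zip(reference, variant) if a != b)
    return changed / len(reference)


def within_limit(reference, variant, limit=0.15):
    """True if no more than `limit` of the residues (or nucleotides) are changed."""
    return fraction_changed(reference, variant) <= limit
```

For example, `seq_id_pair(1)` returns `(1, 2)`, so the first clone's nucleotide sequence is SEQ ID NO: 1 and its polypeptide is SEQ ID NO: 2.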

NOVX Nucleic Acids and Polypeptides

One aspect of the invention pertains to isolated nucleic acid molecules that encode 
NOVX polypeptides or biologically active portions thereof. Also included in the invention 
are nucleic acid fragments sufficient for use as hybridization probes to identify 
NOVX-encoding nucleic acids (e.g., NOVX mRNAs) and fragments for use as PCR
primers for the amplification and/or mutation of NOVX nucleic acid molecules. As used
herein, the term "nucleic acid molecule" is intended to include DNA molecules (e.g.,
cDNA or genomic DNA), RNA molecules (e.g., mRNA), analogs of the DNA or RNA
generated using nucleotide analogs, and derivatives, fragments and homologs thereof. The
nucleic acid molecule may be single-stranded or double-stranded, but preferably is
comprised of double-stranded DNA.

A NOVX nucleic acid can encode a mature NOVX polypeptide. As used herein, a 
"mature" form of a polypeptide or protein disclosed in the present invention is the product 
of a naturally occurring polypeptide or precursor form or proprotein. The naturally
occurring polypeptide, precursor or proprotein includes, by way of nonlimiting example,
the full-length gene product encoded by the corresponding gene. Alternatively, it may be
defined as the polypeptide, precursor or proprotein encoded by an ORF described herein.
The product "mature" form arises, by way of nonlimiting example, as a result of one or
more naturally occurring processing steps that may take place within the cell (e.g., host
cell) in which the gene product arises. Examples of such processing steps leading to a
"mature" form of a polypeptide or protein include the cleavage of the N-terminal
methionine residue encoded by the initiation codon of an ORF, or the proteolytic cleavage
of a signal peptide or leader sequence. Thus a mature form arising from a precursor
polypeptide or protein that has residues 1 to N, where residue 1 is the N-terminal
methionine, would have residues 2 through N remaining after removal of the N-terminal
methionine. Alternatively, a mature form arising from a precursor polypeptide or protein
having residues 1 to N, in which an N-terminal signal sequence from residue 1 to residue M
is cleaved, would have the residues from residue M+1 to residue N remaining. Further as
used herein, a "mature" form of a polypeptide or protein may arise from a step of
post-translational modification other than a proteolytic cleavage event. Such additional
processes include, by way of non-limiting example, glycosylation, myristylation or
phosphorylation. In general, a mature polypeptide or protein may result from the operation
of only one of these processes, or a combination of any of them.
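
The residue arithmetic in the two cleavage examples above (residues 2 through N, and residues M+1 through N) can be made concrete with a short Python sketch. It is offered only as an illustration. The function names and the toy precursor sequence are hypothetical, residue numbering is 1-based as in the text, and the signal peptide length M is assumed to be known in advance rather than predicted.

```python
def remove_initiator_met(precursor):
    """Return residues 2..N: drop the N-terminal methionine if one is present."""
    return precursor[1:] if precursor.startswith("M") else precursor


def cleave_signal_peptide(precursor, m):
    """Return residues M+1..N after removal of a signal sequence spanning 1..M."""
    if not 0 <= m < len(precursor):
        raise ValueError("M must leave at least one residue in the mature form")
    # Python slices are 0-based, so index m is residue M+1.
    return precursor[m:]


precursor = "MKTLLVAAGSQ"                         # hypothetical 11-residue precursor
mature_a = remove_initiator_met(precursor)        # residues 2..11
mature_b = cleave_signal_peptide(precursor, 5)    # residues 6..11
```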

The term "probe", as utilized herein, refers to nucleic acid sequences of variable
length, preferably between at least about 10 nucleotides (nt), about 100 nt, or as many as
approximately, e.g., 6,000 nt, depending upon the specific use. Probes are used in the
detection of identical, similar, or complementary nucleic acid sequences. Longer length
probes are generally obtained from a natural or recombinant source, are highly specific, and
are much slower to hybridize than shorter-length oligomer probes. Probes may be
single-stranded or double-stranded and designed to have specificity in PCR, membrane-based
hybridization technologies, or ELISA-like technologies.

The term "isolated" nucleic acid molecule, as used herein, is a nucleic acid that is
separated from other nucleic acid molecules which are present in the natural source of the
nucleic acid. Preferably, an "isolated" nucleic acid is free of sequences which naturally
flank the nucleic acid (i.e., sequences located at the 5'- and 3'-termini of the nucleic acid) in
the genomic DNA of the organism from which the nucleic acid is derived. For example, in
various embodiments, the isolated NOVX nucleic acid molecules can contain less than
about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of nucleotide sequences which naturally
flank the nucleic acid molecule in genomic DNA of the cell/tissue from which the nucleic
acid is derived (e.g., brain, heart, liver, spleen, etc.). Moreover, an "isolated" nucleic acid
molecule, such as a cDNA molecule, can be substantially free of other cellular material, or
culture medium, or of chemical precursors or other chemicals.

A nucleic acid molecule of the invention, e.g., a nucleic acid molecule having the
nucleotide sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, or a
complement of this nucleotide sequence, can be isolated using standard molecular biology
techniques and the sequence information provided herein. Using all or a portion of the
nucleic acid sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, as a
hybridization probe, NOVX molecules can be isolated using standard hybridization and
cloning techniques (e.g., as described in Sambrook, et al., (eds.), Molecular Cloning: A
Laboratory Manual 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor,
NY, 1989; and Ausubel, et al., (eds.), Current Protocols in Molecular Biology, John
Wiley & Sons, New York, NY, 1993.)

A nucleic acid of the invention can be amplified using cDNA, mRNA or 
alternatively, genomic DNA, as a template with appropriate oligonucleotide primers 
according to standard PCR amplification techniques. The nucleic acid so amplified can be
cloned into an appropriate vector and characterized by DNA sequence analysis.
Furthermore, oligonucleotides corresponding to NOVX nucleotide sequences can be
prepared by standard synthetic techniques, e.g., using an automated DNA synthesizer. 

As used herein, the term "oligonucleotide" refers to a series of linked nucleotide 
residues. A short oligonucleotide sequence may be based on, or designed from, a genomic 
or cDNA sequence and is used to amplify, confirm, or reveal the presence of an identical,
similar or complementary DNA or RNA in a particular cell or tissue. Oligonucleotides
comprise a nucleic acid sequence having about 10 nt, 50 nt, or 100 nt in length, preferably
about 15 nt to 30 nt in length. In one embodiment of the invention, an oligonucleotide
comprising a nucleic acid molecule less than 100 nt in length would further comprise at
least 6 contiguous nucleotides of SEQ ID NO:2n-1, wherein n is an integer between 1 and
124, or a complement thereof. Oligonucleotides may be chemically synthesized and may
also be used as probes. 
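
As an illustrative sketch only, the length and contiguity criteria of the embodiment above (an oligonucleotide under 100 nt that contains at least 6 contiguous nucleotides of a reference sequence or of its complement) could be checked in Python as follows. The function names and the toy reference sequence are hypothetical. The sketch assumes DNA written 5' to 3' using only A, C, G and T, and it treats the "complement" as the reverse complement read 5' to 3'.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def shares_contiguous_run(oligo, reference, min_run=6):
    """True if the oligo contains >= min_run contiguous nt of either strand."""
    oligo = oligo.upper()
    strands = (reference.upper(), reverse_complement(reference))
    for i in range(len(oligo) - min_run + 1):
        window = oligo[i:i + min_run]
        if any(window in strand for strand in strands):
            return True
    return False


def qualifies(oligo, reference, max_len=100, min_run=6):
    """Apply both criteria: shorter than max_len and sharing a contiguous run."""
    return len(oligo) < max_len and shares_contiguous_run(oligo, reference, min_run)
```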

In another embodiment, an isolated nucleic acid molecule of the invention
comprises a nucleic acid molecule that is a complement of the nucleotide sequence shown
in SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, or a portion of this
nucleotide sequence (e.g., a fragment that can be used as a probe or primer or a fragment
encoding a biologically-active portion of a NOVX polypeptide). A nucleic acid molecule
that is complementary to the nucleotide sequence of SEQ ID NO:2n-1, wherein n is an
integer between 1 and 124, is one that is sufficiently complementary to the nucleotide
sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, that it can
hydrogen bond with few or no mismatches to the nucleotide sequence shown in SEQ ID
NO:2n-1, wherein n is an integer between 1 and 124, thereby forming a stable duplex.

As used herein, the term "complementary" refers to Watson-Crick or Hoogsteen
base pairing between nucleotide units of a nucleic acid molecule, and the term "binding"
means the physical or chemical interaction between two polypeptides or compounds or
associated polypeptides or compounds or combinations thereof. Binding includes ionic,
non-ionic, van der Waals, hydrophobic interactions, and the like. A physical interaction
can be either direct or indirect. Indirect interactions may be through or due to the effects of
another polypeptide or compound. Direct binding refers to interactions that do not take
place through, or due to, the effect of another polypeptide or compound, but instead are
without other substantial chemical intermediates.
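
To illustrate what "few or no mismatches" means for a duplex, the sketch below counts non-Watson-Crick positions when two equal-length DNA strands are aligned antiparallel. It is a simplified illustration under stated assumptions. The function name is hypothetical, both strands are written 5' to 3', only Watson-Crick pairs are scored (Hoogsteen pairing, gaps and bulges are not modeled), and no threshold for "few" is implied by the source.

```python
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def count_mismatches(strand, candidate):
    """Count non-Watson-Crick positions in an antiparallel duplex.

    Both strands are given 5' to 3'; the candidate is reversed so that
    its 3' end pairs with the 5' end of `strand`.
    """
    if len(strand) != len(candidate):
        raise ValueError("this sketch only compares strands of equal length")
    pairs = zip(strand.upper(), reversed(candidate.upper()))
    return sum(1 for pair in pairs if pair not in WATSON_CRICK)
```

A perfect complement gives a count of zero, and each substituted base in the candidate raises the count by one.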

A "fragment" provided herein is defined as a sequence of at least 6 (contiguous) 
nucleic acids or at least 4 (contiguous) amino acids, a length sufficient to allow for specific
hybridization in the case of nucleic acids or for specific recognition of an epitope in the
case of amino acids, and is at most some portion less than a full length sequence. 
Fragments may be derived from any contiguous portion of a nucleic acid or amino acid 
sequence of choice. 

A full-length NOVX clone is identified as containing an ATG translation start 
codon and an in-frame stop codon. Any disclosed NOVX nucleotide sequence lacking an
ATG start codon therefore encodes a truncated C-terminal fragment of the respective 
NOVX polypeptide, and requires that the corresponding full-length cDNA extend in the 5' 
direction of the disclosed sequence. Any disclosed NOVX nucleotide sequence lacking an 
in-frame stop codon similarly encodes a truncated N-terminal fragment of the respective 
NOVX polypeptide, and requires that the corresponding full-length cDNA extend in the 3'
direction of the disclosed sequence. 
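
The full-length criterion above can be expressed as a small decision procedure. The following Python sketch is illustrative only. The function name and return labels are hypothetical, and it assumes the input is the coding strand written 5' to 3' with the reading frame starting at the first nucleotide.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_clone(cds):
    """Classify a coding-strand sequence read in frame from its first base."""
    cds = cds.upper()
    # Split into complete in-frame codons; a trailing partial codon is ignored.
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    has_start = bool(codons) and codons[0] == "ATG"
    has_stop = any(codon in STOP_CODONS for codon in codons)
    if has_start and has_stop:
        return "full-length"
    if has_stop:
        return "C-terminal fragment: extend 5'"
    if has_start:
        return "N-terminal fragment: extend 3'"
    return "internal fragment: extend 5' and 3'"
```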

A "derivative" is a nucleic acid sequence or amino acid sequence formed from the 
native compounds either directly, by modification or partial substitution. An "analog" is a
nucleic acid sequence or amino acid sequence that has a structure similar to, but not
identical to, the native compound, e.g., it differs from it in respect to certain components
or side chains. Analogs may be synthetic or derived from a different evolutionary origin
and may have a similar or opposite metabolic activity compared to wild type. A
"homolog" is a nucleic acid sequence or amino acid sequence of a particular gene that is
derived from different species.

Derivatives and analogs may be full length or other than full length. Derivatives or 
analogs of the nucleic acids or proteins of the invention include, but are not limited to, 
molecules comprising regions that are substantially homologous to the nucleic acids or
proteins of the invention, in various embodiments, by at least about 70%, 80%, or 95%
identity (with a preferred identity of 80-95%) over a nucleic acid or amino acid sequence of
identical size or when compared to an aligned sequence in which the alignment is done by a
computer homology program known in the art, or whose encoding nucleic acid is capable
of hybridizing to the complement of a sequence encoding the proteins under stringent,
moderately stringent, or low stringent conditions. See e.g. Ausubel, et al., Current
Protocols in Molecular Biology, John Wiley & Sons, New York, NY, 1993, and
below.
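
For sequences of identical size, the percent identity figures above (70%, 80%, 95%) reduce to a position-by-position comparison. The Python sketch below is a minimal illustration under that assumption. The function names are hypothetical, and it does not perform alignment, which the text leaves to a computer homology program known in the art.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that are identical in two equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of identical size")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


def meets_identity(seq_a, seq_b, minimum=70.0):
    """True if the two sequences are at least `minimum` percent identical."""
    return percent_identity(seq_a, seq_b) >= minimum
```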

A "homologous nucleic acid sequence" or "homologous amino acid sequence," or
variations thereof, refer to sequences characterized by a homology at the nucleotide level or
amino acid level as discussed above. Homologous nucleotide sequences include those
sequences coding for isoforms of NOVX polypeptides. Isoforms can be expressed in
different tissues of the same organism as a result of, for example, alternative splicing of
RNA. Alternatively, isoforms can be encoded by different genes. In the invention,
homologous nucleotide sequences include nucleotide sequences encoding for a NOVX
polypeptide of species other than humans, including, but not limited to: vertebrates, and
thus can include, e.g., frog, mouse, rat, rabbit, dog, cat, cow, horse, and other organisms.
Homologous nucleotide sequences also include, but are not limited to, naturally occurring
allelic variations and mutations of the nucleotide sequences set forth herein. A homologous
nucleotide sequence does not, however, include the exact nucleotide sequence encoding
human NOVX protein. Homologous nucleic acid sequences include those nucleic acid
sequences that encode conservative amino acid substitutions (see below) in SEQ ID
NO:2n-1, wherein n is an integer between 1 and 124, as well as a polypeptide possessing
NOVX biological activity. Various biological activities of the NOVX proteins are
described below.

A NOVX polypeptide is encoded by the open reading frame ("ORF") of a NOVX
nucleic acid. An ORF corresponds to a nucleotide sequence that could potentially be
translated into a polypeptide. A stretch of nucleic acids comprising an ORF is
uninterrupted by a stop codon. An ORF that represents the coding sequence for a full
protein begins with an ATG "start" codon and terminates with one of the three "stop"
codons, namely, TAA, TAG, or TGA. For the purposes of this invention, an ORF may be
any part of a coding sequence, with or without a start codon, a stop codon, or both. For an
ORF to be considered as a good candidate for coding for a bona fide cellular protein, a
minimum size requirement is often set, e.g., a stretch of DNA that would encode a protein
of 50 amino acids or more.
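
A minimal sketch of an ORF scan with a minimum size filter is given below for illustration. It uses the stricter definition from the paragraph above (ATG start through TAA, TAG or TGA stop) and searches only the three forward reading frames. The function name, the tuple layout and the test sequence are hypothetical and are not taken from the disclosure.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_aa=50):
    """Return (start, end, n_aa) for forward-strand ORFs of at least min_aa.

    `start` is the 0-based index of the A of ATG, `end` is the index just
    past the stop codon, and `n_aa` counts encoded residues, including the
    initiator methionine and excluding the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first ATG opens the ORF
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3
                if n_aa >= min_aa:
                    orfs.append((start, i + 3, n_aa))
                start = None                   # look for the next ORF
    return orfs
```

Lowering `min_aa` admits shorter candidates, which is one way to explore the trade-off between sensitivity and spurious short ORFs.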

The nucleotide sequences determined from the cloning of the human NOVX genes
allow for the generation of probes and primers designed for use in identifying and/or
cloning NOVX homologues in other cell types, e.g. from other tissues, as well as NOVX
homologues from other vertebrates. The probe/primer typically comprises a substantially
purified oligonucleotide. The oligonucleotide typically comprises a region of nucleotide
sequence that hybridizes under stringent conditions to at least about 12, 25, 50, 100, 150,
200, 250, 300, 350 or 400 consecutive nucleotides of a sense strand nucleotide sequence of
SEQ ID NO:2n-1, wherein n is an integer between 1 and 124; or an anti-sense strand nucleotide
sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124; or of a naturally
occurring mutant of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124.

Probes based on the human NOVX nucleotide sequences can be used to detect 
transcripts or genomic sequences encoding the same or homologous proteins. In various
embodiments, the probe has a detectable label attached, e.g. the label can be a radioisotope,
a fluorescent compound, an enzyme, or an enzyme co-factor. Such probes can be used as a
part of a diagnostic test kit for identifying cells or tissues which mis-express a NOVX
protein, such as by measuring a level of a NOVX-encoding nucleic acid in a sample of cells
from a subject, e.g., detecting NOVX mRNA levels or determining whether a genomic
NOVX gene has been mutated or deleted.

"A polypeptide having a biologically-active portion of a NOVX polypeptide" refers 
to polypeptides exhibiting activity similar, but not necessarily identical to, an activity of a 
polypeptide of the invention, including mature forms, as measured in a particular biological 
assay, with or without dose dependency. A nucleic acid fragment encoding a
"biologically-active portion of NOVX" can be prepared by isolating a portion of SEQ ID
NO:2n-1, wherein n is an integer between 1 and 124, that encodes a polypeptide having a
NOVX biological activity (the biological activities of the NOVX proteins are described
below), expressing the encoded portion of NOVX protein (e.g., by recombinant expression
in vitro) and assessing the activity of the encoded portion of NOVX.

NOVX Nucleic Acid and Polypeptide Variants 

The invention further encompasses nucleic acid molecules that differ from the 
nucleotide sequences of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, due
to degeneracy of the genetic code and thus encode the same NOVX proteins as that
encoded by the nucleotide sequences of SEQ ID NO:2n-1, wherein n is an integer between
1 and 124. In another embodiment, an isolated nucleic acid molecule of the invention has a
nucleotide sequence encoding a protein having an amino acid sequence of SEQ ID NO:2n,
wherein n is an integer between 1 and 124. 
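
Degeneracy of the genetic code means that different nucleotide sequences can encode the same protein. The following Python sketch illustrates this with the standard genetic code. The helper name and the two short example sequences are hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds):
    """Translate a coding sequence in frame, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Two sequences that differ at synonymous positions but encode the same peptide.
variant_1 = "ATGGCTAAATAA"
variant_2 = "ATGGCCAAGTGA"
same_protein = translate(variant_1) == translate(variant_2)
```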

In addition to the human NOVX nucleotide sequences of SEQ ID NO:2n-1, wherein
n is an integer between 1 and 124, it will be appreciated by those skilled in the art that
DNA sequence polymorphisms that lead to changes in the amino acid sequences of the
NOVX polypeptides may exist within a population (e.g., the human population). Such
genetic polymorphism in the NOVX genes may exist among individuals within a
population due to natural allelic variation. As used herein, the terms "gene" and 
"recombinant gene" refer to nucleic acid molecules comprising an open reading frame 
(ORF) encoding a NOVX protein, preferably a vertebrate NOVX protein. Such natural 
allelic variations can typically result in 1-5% variance in the nucleotide sequence of the 
NOVX genes. Any and all such nucleotide variations and resulting amino acid 
polymorphisms in the NOVX polypeptides, which are the result of natural allelic variation 
and that do not alter the functional activity of the NOVX polypeptides, are intended to be 
within the scope of the invention. 

Moreover, nucleic acid molecules encoding NOVX proteins from other species, and 
thus that have a nucleotide sequence that differs from a human SEQ ID NO:2n-1, wherein n
is an integer between 1 and 124, are intended to be within the scope of the invention. 
Nucleic acid molecules corresponding to natural allelic variants and homologues of the 
NOVX cDNAs of the invention can be isolated based on their homology to the human 
NOVX nucleic acids disclosed herein using the human cDNAs, or a portion thereof, as a 
hybridization probe according to standard hybridization techniques under stringent 
hybridization conditions. 

Accordingly, in another embodiment, an isolated nucleic acid molecule of the 
invention is at least 6 nucleotides in length and hybridizes under stringent conditions to the
nucleic acid molecule comprising the nucleotide sequence of SEQ ID NO:2n-1, wherein n
is an integer between 1 and 124. In another embodiment, the nucleic acid is at least 10, 25, 
50, 100, 250, 500, 750, 1000, 1500, or 2000 or more nucleotides in length. In yet another 
embodiment, an isolated nucleic acid molecule of the invention hybridizes to the coding 
region. As used herein, the term "hybridizes under stringent conditions" is intended to 
describe conditions for hybridization and washing under which nucleotide sequences at 
least about 65% homologous to each other typically remain hybridized to each other. 

Homologs (i.e., nucleic acids encoding NOVX proteins derived from species other 
than human) or other related sequences (e.g., paralogs) can be obtained by low, moderate or 
high stringency hybridization with all or a portion of the particular human sequence as a 
probe using methods well known in the art for nucleic acid hybridization and cloning. 

As used herein, the phrase "stringent hybridization conditions" refers to conditions 
under which a probe, primer or oligonucleotide will hybridize to its target sequence, but to 
no other sequences. Stringent conditions are sequence-dependent and will be different in 
different circumstances. Longer sequences hybridize specifically at higher temperatures 
than shorter sequences. Generally, stringent conditions are selected to be about 5 °C lower 
than the thermal melting point (Tm) for the specific sequence at a defined ionic strength 
and pH. The Tm is the temperature (under defined ionic strength, pH and nucleic acid 
concentration) at which 50% of the probes complementary to the target sequence hybridize 
to the target sequence at equilibrium. Since the target sequences are generally present at 
excess, at Tm, 50% of the probes are occupied at equilibrium. Typically, stringent
conditions will be those in which the salt concentration is less than about 1.0 M sodium ion,
typically about 0.01 to 1.0 M sodium ion (or other salts) at pH 7.0 to 8.3 and the
temperature is at least about 30 °C for short probes, primers or oligonucleotides (e.g., 10 nt 
to 50 nt) and at least about 60 °C for longer probes, primers and oligonucleotides. 
Stringent conditions may also be achieved with the addition of destabilizing agents, such as 
formamide. 
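
The rule of thumb above, that stringent conditions are chosen about 5 °C below the Tm, can be illustrated numerically. The text does not specify how Tm is calculated, so the sketch below substitutes the Wallace rule, Tm ≈ 2 °C × (A+T) + 4 °C × (G+C), a common rough estimate for short oligonucleotides that is an assumption here and is not part of the disclosure. The function names and the example oligonucleotide are hypothetical.

```python
def wallace_tm(oligo):
    """Rough Tm estimate in degrees C for a short DNA oligo (Wallace rule)."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if not seq or at + gc != len(seq):
        raise ValueError("oligo must be non-empty and contain only A, C, G, T")
    return 2.0 * at + 4.0 * gc


def stringent_temperature(oligo, offset=5.0):
    """Temperature about `offset` degrees C below the estimated Tm."""
    return wallace_tm(oligo) - offset
```

For the 14-nt example `ATGGCCAAGCTTGA` (7 A/T and 7 G/C), the estimate is 42 °C, which puts the stringent temperature near 37 °C. Longer probes would need a different Tm model.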

Stringent conditions are known to those skilled in the art and can be found in
Ausubel, et al. (eds.), Current Protocols in Molecular Biology, John Wiley & Sons,
N.Y. (1989), 6.3.1-6.3.6. Preferably, the conditions are such that sequences at least about
65%, 70%, 75%, 85%, 90%, 95%, 98%, or 99% homologous to each other typically remain
hybridized to each other. A non-limiting example of stringent hybridization conditions is
hybridization in a high salt buffer comprising 6X SSC, 50 mM Tris-HCl (pH 7.5), 1 mM
EDTA, 0.02% PVP, 0.02% Ficoll, 0.02% BSA, and 500 mg/ml denatured salmon sperm
DNA at 65°C, followed by one or more washes in 0.2X SSC, 0.01% BSA at 50°C. An
isolated nucleic acid molecule of the invention that hybridizes under stringent conditions to
a sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, corresponds to
a naturally-occurring nucleic acid molecule. As used herein, a "naturally-occurring"
nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence
that occurs in nature (e.g., encodes a natural protein).

In a second embodiment, a nucleic acid sequence that is hybridizable to the nucleic
acid molecule comprising the nucleotide sequence of SEQ ID NO:2n-1, wherein n is an
integer between 1 and 124, or fragments, analogs or derivatives thereof, under conditions of
moderate stringency is provided. A non-limiting example of moderate stringency
hybridization conditions is hybridization in 6X SSC, 5X Denhardt's solution, 0.5% SDS
and 100 mg/ml denatured salmon sperm DNA at 55 °C, followed by one or more washes in
1X SSC, 0.1% SDS at 37 °C. Other conditions of moderate stringency that may be used
are well-known within the art. See, e.g., Ausubel, et al. (eds.), 1993, Current Protocols
in Molecular Biology, John Wiley & Sons, NY, and Kriegler, 1990, Gene Transfer
and Expression, A Laboratory Manual, Stockton Press, NY.

In a third embodiment, a nucleic acid that is hybridizable to the nucleic acid
molecule comprising the nucleotide sequences of SEQ ID NO:2n-1, wherein n is an integer
between 1 and 124, or fragments, analogs or derivatives thereof, under conditions of low
stringency, is provided. A non-limiting example of low stringency hybridization conditions
is hybridization in 35% formamide, 5X SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA,
0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100 mg/ml denatured salmon sperm DNA, 10%
(wt/vol) dextran sulfate at 40°C, followed by one or more washes in 2X SSC, 25 mM
Tris-HCl (pH 7.4), 5 mM EDTA, and 0.1% SDS at 50°C. Other conditions of low
stringency that may be used are well known in the art (e.g., as employed for cross-species
hybridizations). See, e.g., Ausubel, et al. (eds.), 1993, Current Protocols in
Molecular Biology, John Wiley & Sons, NY, and Kriegler, 1990, Gene Transfer and
Expression, A Laboratory Manual, Stockton Press, NY; Shilo and Weinberg, 1981,
Proc Natl Acad Sci USA 78: 6789-6792.

Conservative Mutations 

In addition to naturally-occurring allelic variants of NOVX sequences that may 
exist in the population, the skilled artisan will further appreciate that changes can be
introduced by mutation into the nucleotide sequences of SEQ ID NO:2n-1, wherein n is an
integer between 1 and 124, thereby leading to changes in the amino acid sequences of the
encoded NOVX protein, without altering the functional ability of that NOVX protein. For
example, nucleotide substitutions leading to amino acid substitutions at "non-essential"
amino acid residues can be made in the sequence of SEQ ID NO:2n, wherein n is an integer
between 1 and 124. A "non-essential" amino acid residue is a residue that can be altered
from the wild-type sequences of the NOVX proteins without altering their biological
activity, whereas an "essential" amino acid residue is required for such biological activity.
For example, amino acid residues that are conserved among the NOVX proteins of the
invention are predicted to be particularly non-amenable to alteration. Amino acids for
which conservative substitutions can be made are well-known within the art.

Another aspect of the invention pertains to nucleic acid molecules encoding NOVX 
proteins that contain changes in amino acid residues that are not essential for activity. Such 
NOVX proteins differ in amino acid sequence from SEQ ID NO:2n-1, wherein n is an
integer between 1 and 124, yet retain biological activity. In one embodiment, the isolated
nucleic acid molecule comprises a nucleotide sequence encoding a protein, wherein the
protein comprises an amino acid sequence at least about 40% homologous to the amino
acid sequences of SEQ ID NO:2n, wherein n is an integer between 1 and 124. Preferably,
the protein encoded by the nucleic acid molecule is at least about 60% homologous to SEQ
ID NO:2n, wherein n is an integer between 1 and 124; more preferably at least about 70%
homologous to SEQ ID NO:2n, wherein n is an integer between 1 and 124; still more
preferably at least about 80% homologous to SEQ ID NO:2n, wherein n is an integer
between 1 and 124; even more preferably at least about 90% homologous to SEQ ID
NO:2n, wherein n is an integer between 1 and 124; and most preferably at least about 95%
homologous to SEQ ID NO:2n, wherein n is an integer between 1 and 124.

An isolated nucleic acid molecule encoding a NOVX protein homologous to the 
protein of SEQ ID NO:2n, wherein n is an integer between 1 and 124, can be created by 
introducing one or more nucleotide substitutions, additions or deletions into the nucleotide 
sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, such that one or
more amino acid substitutions, additions or deletions are introduced into the encoded
protein.

Mutations can be introduced into any one of SEQ ID NO:2n-1, wherein n is an integer
between 1 and 124, by standard techniques, such as site-directed mutagenesis and
PCR-mediated mutagenesis. Preferably, conservative amino acid substitutions are made at
one or more predicted, non-essential amino acid residues. A "conservative amino acid
substitution" is one in which the amino acid residue is replaced with an amino acid residue
having a similar side chain. Families of amino acid residues having similar side chains
have been defined within the art. These families include amino acids with basic side chains
(e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid),
uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine,
tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline,
phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine,
isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine).
Thus, a predicted non-essential amino acid residue in the NOVX protein is replaced with
another amino acid residue from the same side chain family. Alternatively, in another
embodiment, mutations can be introduced randomly along all or part of a NOVX coding
sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for
NOVX biological activity to identify mutants that retain activity. Following mutagenesis
of a nucleic acid of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, the
encoded protein can be expressed by any recombinant technology known in the art and the
activity of the protein can be determined.
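
The side chain families listed above can be expressed as a simple lookup. The following is a minimal sketch in Python, offered only as an illustration; the function names are hypothetical and the family membership is transcribed from the list above using single letter amino acid codes. A residue may belong to more than one family (e.g., valine is both nonpolar and beta-branched).

```python
# Side chain families transcribed from the text (single letter codes).
# Function names are illustrative assumptions, not part of the disclosure.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged_polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta_branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(a, b):
    """Return the sorted names of all families that contain both residues."""
    a, b = a.upper(), b.upper()
    return sorted(
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    )


def is_conservative(a, b):
    """True when two different residues share at least one side chain family."""
    return a.upper() != b.upper() and bool(shared_families(a, b))
```

For instance, `is_conservative("K", "R")` is true because both residues are basic, whereas `is_conservative("D", "K")` is false because no listed family contains both.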

The relatedness of amino acid families may also be determined based on side chain
interactions. Substituted amino acids may be fully conserved "strong" residues or fully
conserved "weak" residues. The "strong" group of conserved amino acid residues may be
any one of the following groups: STA, NEQK, NHQK, NDEQ, QHRK, MILV, MILF, HY,
FYW, wherein the single letter amino acid codes are grouped by those amino acids that
may be substituted for each other. Likewise, the "weak" group of conserved residues may
be any one of the following: CSA, ATV, SAG, STNK, STPA, SGND, SNDEQK,
NDEQHK, NEQHRK, HFY, wherein the letters within each group represent the single
letter amino acid code.
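
As an illustration of how the "strong" and "weak" groups might be applied, the sketch below classifies a single substitution. It is a hypothetical helper, not a method prescribed by the text. The groups follow the lists above, with the two methionine-containing strong groups read as MILV and MILF, which is an assumption based on the groups commonly used by alignment software.

```python
# Groups follow the text; MILV and MILF are an assumed reading (see lead-in).
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND", "SNDEQK",
               "NDEQHK", "NEQHRK", "HFY"]


def classify_substitution(a, b):
    """Classify a residue pair as identical, strong, weak or non-conserved."""
    a, b = a.upper(), b.upper()
    if a == b:
        return "identical"
    # Strong groups are checked first, so a pair found in both lists is "strong".
    if any(a in group and b in group for group in STRONG_GROUPS):
        return "strong"
    if any(a in group and b in group for group in WEAK_GROUPS):
        return "weak"
    return "non-conserved"
```

Checking the strong groups before the weak groups means that a pair present in both lists is reported at the higher level of conservation.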

In one embodiment, a mutant NOVX protein can be assayed for (i) the ability to
form protein:protein interactions with other NOVX proteins, other cell-surface proteins, or
biologically-active portions thereof; (ii) complex formation between a mutant NOVX
protein and a NOVX ligand; or (iii) the ability of a mutant NOVX protein to bind to an
intracellular target protein or biologically-active portion thereof (e.g., avidin proteins).

In yet another embodiment, a mutant NOVX protein can be assayed for the ability
to regulate a specific biological function (e.g., regulation of insulin release).

Interfering RNA 

In one aspect of the invention, NOVX gene expression can be attenuated by RNA 
interference. One approach well-known in the art is short interfering RNA (siRNA)
mediated gene silencing where expression products of a NOVX gene are targeted by
specific double stranded NOVX derived siRNA nucleotide sequences that are
complementary to at least a 19-25 nt long segment of the NOVX gene transcript, including
the 5' untranslated (UT) region, the ORF, or the 3' UT region. See, e.g., PCT applications
WO00/44895, WO99/32619, WO01/75164, WO01/92513, WO01/29058, WO01/89304,
WO02/16620, and WO02/29858, each incorporated by reference herein in its entirety.
Targeted genes can be a NOVX gene, or an upstream or downstream modulator of the
NOVX gene. Nonlimiting examples of upstream or downstream modulators of a NOVX
gene include, e.g., a transcription factor that binds the NOVX gene promoter, a kinase or
phosphatase that interacts with a NOVX polypeptide, and polypeptides involved in a
NOVX regulatory pathway. 

According to the methods of the present invention, NOVX gene expression is 
silenced using short interfering RNA. A NOVX polynucleotide according to the invention 
includes a siRNA polynucleotide. Such a NOVX siRNA can be obtained using a NOVX
polynucleotide sequence, for example, by processing the NOVX ribopolynucleotide
sequence in a cell-free system, such as but not limited to a Drosophila extract, or by
transcription of recombinant double stranded NOVX RNA or by chemical synthesis of
nucleotide sequences homologous to a NOVX sequence. See, e.g., Tuschl, Zamore,
Lehmann, Bartel and Sharp (1999), Genes & Dev. 13: 3191-3197, incorporated herein by
reference in its entirety. When synthesized, a typical 0.2 micromolar-scale RNA synthesis
provides about 1 milligram of siRNA, which is sufficient for 1000 transfection experiments
using a 24-well tissue culture plate format.

The most efficient silencing is generally observed with siRNA duplexes composed
of a 21-nt sense strand and a 21-nt antisense strand, paired in a manner to have a 2-nt
3' overhang. The sequence of the 2-nt 3' overhang makes an additional small contribution
to the specificity of siRNA target recognition. The contribution to specificity is localized to
the unpaired nucleotide adjacent to the first paired bases. In one embodiment, the
nucleotides in the 3' overhang are ribonucleotides. In an alternative embodiment, the
nucleotides in the 3' overhang are deoxyribonucleotides. Using 2'-deoxyribonucleotides in
the 3' overhangs is as efficient as using ribonucleotides, but deoxyribonucleotides are often
cheaper to synthesize and are most likely more nuclease resistant.

A contemplated recombinant expression vector of the invention comprises a NOVX 
DNA molecule cloned into an expression vector comprising operatively-linked regulatory
sequences flanking the NOVX sequence in a manner that allows for expression (by 
transcription of the DNA molecule) of both strands. An RNA molecule that is antisense to 
NOVX mRNA is transcribed by a first promoter (e.g., a promoter sequence 3' of the cloned 
DNA) and an RNA molecule that is the sense strand for the NOVX mRNA is transcribed 
by a second promoter (e.g., a promoter sequence 5' of the cloned DNA). The sense and 
antisense strands may hybridize in vivo to generate siRNA constructs for silencing of the 
NOVX gene. Alternatively, two constructs can be utilized to create the sense and 
anti-sense strands of a siRNA construct. Finally, cloned DNA can encode a construct 
having secondary structure, wherein a single transcript has both the sense and 
complementary antisense sequences from the target gene or genes. In an example of this 
embodiment, a hairpin RNAi product is homologous to all or a portion of the target gene. 
In another example, a hairpin RNAi product is a siRNA. The regulatory sequences 
flanking the NOVX sequence may be identical or may be different, such that their 
expression may be modulated independently, or in a temporal or spatial manner. 

In a specific embodiment, siRNAs are transcribed intracellularly by cloning the
NOVX gene templates into a vector containing, e.g., a RNA pol III transcription unit from
the small nuclear RNA (snRNA) U6 or the human RNase P RNA H1. One example of a
vector system is the GeneSuppressor™ RNA Interference kit (commercially available from
Imgenex). The U6 and H1 promoters are members of the type III class of Pol III promoters.
The +1 nucleotide of the U6-like promoters is always guanosine, whereas the +1 for H1
promoters is adenosine. The termination signal for these promoters is defined by five
consecutive thymidines. The transcript is typically cleaved after the second uridine.
Cleavage at this position generates a 3' UU overhang in the expressed siRNA, which is
similar to the 3' overhangs of synthetic siRNAs. Any sequence less than 400 nucleotides in
length can be transcribed by these promoters; therefore they are ideally suited for the
expression of around 21-nucleotide siRNAs in, e.g., an approximately 50-nucleotide RNA
stem-loop transcript.

A siRNA vector appears to have an advantage over synthetic siRNAs where long
term knock-down of expression is desired. Cells transfected with a siRNA expression
vector would experience steady, long-term mRNA inhibition. In contrast, cells transfected
with exogenous synthetic siRNAs typically recover from mRNA suppression within seven
days or ten rounds of cell division. The long-term gene silencing ability of siRNA
expression vectors may provide for applications in gene therapy.
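
The stem-loop layout described above for pol III vectors (a sense strand, a loop, the complementary antisense strand, and a run of five thymidines as terminator) can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the function names are hypothetical, the nine-nucleotide loop is an arbitrary placeholder chosen for this example and is not taken from the text, and sequences are written as the DNA coding strand.

```python
# Illustrative only: names and the default loop are assumptions for this sketch.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# +1 nucleotide stated in the text for each promoter type.
REQUIRED_FIRST_BASE = {"U6": "G", "H1": "A"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def hairpin_template(sense_dna, loop="TTCAAGAGA", promoter="U6"):
    """Build sense + loop + antisense + TTTTT for a pol III transcription unit."""
    sense = sense_dna.upper()
    if set(sense) - set("ACGT"):
        raise ValueError("sense sequence must contain only A, C, G and T")
    first = REQUIRED_FIRST_BASE[promoter]
    if not sense.startswith(first):
        raise ValueError(f"{promoter} transcripts are expected to start with {first}")
    stem_loop = sense + loop + reverse_complement(sense)
    # Five consecutive thymidines inside the insert would end transcription early.
    if "TTTTT" in stem_loop:
        raise ValueError("insert contains a premature TTTTT termination signal")
    return stem_loop + "TTTTT"
```

With a 21-nucleotide sense strand and the placeholder loop, the stem-loop portion is 51 nucleotides long, in line with the approximately 50-nucleotide transcript mentioned above.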

In general, siRNAs are chopped from longer dsRNA by an ATP-dependent
ribonuclease called DICER. DICER is a member of the RNase III family of
double-stranded RNA-specific endonucleases. The siRNAs assemble with cellular proteins
into an endonuclease complex. In vitro studies in Drosophila suggest that the
siRNAs/protein complex (siRNP) is then transferred to a second enzyme complex, called
an RNA-induced silencing complex (RISC), which contains an endoribonuclease that is
distinct from DICER. RISC uses the sequence encoded by the antisense siRNA strand to
find and destroy mRNAs of complementary sequence. The siRNA thus acts as a guide,
restricting the ribonuclease to cleave only mRNAs complementary to one of the two siRNA
strands.

A NOVX mRNA region to be targeted by siRNA is generally selected from a
desired NOVX sequence beginning 50 to 100 nt downstream of the start codon.
Alternatively, 5' or 3' UTRs and regions nearby the start codon can be used but are
generally avoided, as these may be richer in regulatory protein binding sites. UTR-binding
proteins and/or translation initiation complexes may interfere with binding of the siRNP or
RISC endonuclease complex. An initial BLAST homology search for the selected siRNA
sequence is done against an available nucleotide sequence library to ensure that only one
gene is targeted. Specificity of target recognition by siRNA duplexes indicates that a single
point mutation located in the paired region of an siRNA duplex is sufficient to abolish
target mRNA degradation. See, Elbashir et al. 2001 EMBO J. 20(23):6877-88. Hence,
consideration should be taken to accommodate SNPs, polymorphisms, allelic variants or
species-specific variations when targeting a desired gene.

In one embodiment, a complete NOVX siRNA experiment includes the proper
negative control. A negative control siRNA generally has the same nucleotide composition
as the NOVX siRNA but lacks significant sequence homology to the genome. Typically,
one would scramble the nucleotide sequence of the NOVX siRNA and do a homology
search to make sure it lacks homology to any other gene.

Two independent NOVX siRNA duplexes can be used to knock-down a target
NOVX gene. This helps to control for specificity of the silencing effect. In addition,
expression of two independent genes can be simultaneously knocked down by using equal
concentrations of different NOVX siRNA duplexes, e.g., a NOVX siRNA and an siRNA
for a regulator of a NOVX gene or polypeptide. Availability of siRNA-associating proteins
is believed to be more limiting than target mRNA accessibility.
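
The scrambled negative control described above (same nucleotide composition, no significant homology) might be generated along the following lines. This is a minimal Python sketch with hypothetical function names. It only ensures that the scrambled sequence shares no stretch of `k` consecutive nucleotides with the original, where `k = 8` is an arbitrary assumption; the homology search against a sequence library is a separate step not shown here.

```python
import random


def shares_kmer(a, b, k):
    """True if sequences a and b have any run of k consecutive bases in common."""
    kmers = {a[i:i + k] for i in range(len(a) - k + 1)}
    return any(b[i:i + k] in kmers for i in range(len(b) - k + 1))


def scramble_control(sirna, k=8, seed=0, max_tries=1000):
    """Shuffle the siRNA bases until no k-mer is shared with the original."""
    original = sirna.upper()
    rng = random.Random(seed)  # seeded so the same control is produced each run
    bases = list(original)
    for _ in range(max_tries):
        rng.shuffle(bases)
        candidate = "".join(bases)
        if candidate != original and not shares_kmer(original, candidate, k):
            return candidate
    raise RuntimeError("no acceptable scrambled sequence found")
```

Because the candidate is a permutation of the original bases, its nucleotide composition is identical by construction.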

A targeted NOVX region is typically a sequence of two adenines (AA) and two
thymidines (TT) divided by a spacer region of nineteen (N19) residues (e.g., AA(N19)TT).
A desirable spacer region has a G/C-content of approximately 30% to 70%, and more
preferably of about 50%. If the sequence AA(N19)TT is not present in the target sequence,
an alternative target region would be AA(N21). The sequence of the NOVX sense siRNA
corresponds to (N19)TT or N21, respectively. In the latter case, conversion of the 3' end of
the sense siRNA to TT can be performed if such a sequence does not naturally occur in the
NOVX polynucleotide. The rationale for this sequence conversion is to generate a
symmetric duplex with respect to the sequence composition of the sense and antisense 3'
overhangs. Symmetric 3' overhangs may help to ensure that the siRNPs are formed with
approximately equal ratios of sense and antisense target RNA-cleaving siRNPs. See, e.g.,
Elbashir, Lendeckel and Tuschl (2001). Genes & Dev. 15: 188-200, incorporated by
reference herein in its entirety. The modification of the overhang of the sense sequence of
the siRNA duplex is not expected to affect targeted mRNA recognition, as the antisense
siRNA strand guides target recognition.
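
The AA(N19)TT selection rule lends itself to a simple scan of a cDNA sequence. The sketch below is illustrative only; the function names and parameters are hypothetical. It reports sites that begin at least 50 nt downstream of the start codon and whose 19-nt spacer has a G/C content between 30% and 70%, and returns the corresponding sense siRNA as (N19)TT in DNA letters.

```python
import re


def gc_fraction(seq):
    """Fraction of G and C residues in a sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def find_targets(cdna, start_codon_index, min_offset=50, gc_min=0.30, gc_max=0.70):
    """Return (position, 23-nt site, sense strand) for each AA(N19)TT site.

    Positions are 0-based indices into cdna. The lookahead in the pattern
    allows overlapping sites to be reported.
    """
    cdna = cdna.upper()
    first_allowed = start_codon_index + min_offset
    hits = []
    for match in re.finditer(r"(?=(AA([ACGT]{19})TT))", cdna):
        if match.start() < first_allowed:
            continue  # too close to the start codon
        site, spacer = match.group(1), match.group(2)
        if gc_min <= gc_fraction(spacer) <= gc_max:
            hits.append((match.start(), site, spacer + "TT"))
    return hits
```

Each candidate returned by such a scan would still need the homology search described earlier before use.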

Alternatively, if the NOVX target mRNA does not contain a suitable AA(N21)
sequence, one may search for the sequence NA(N21). Further, the sequence of the sense
strand and antisense strand may still be synthesized as 5' (N19)TT, as it is believed that the
sequence of the 3'-most nucleotide of the antisense siRNA does not contribute to
specificity. Unlike antisense or ribozyme technology, the secondary structure of the target
mRNA does not appear to have a strong effect on silencing. See, Harborth, et al. (2001) J.
Cell Science 114: 4557-4565, incorporated by reference in its entirety.

Transfection of NOVX siRNA duplexes can be achieved using standard nucleic
acid transfection methods, for example, OLIGOFECTAMINE Reagent (commercially
available from Invitrogen). An assay for NOVX gene silencing is generally performed
approximately 2 days after transfection. No NOVX gene silencing has been observed in
the absence of transfection reagent, allowing for a comparative analysis of the wild-type
and silenced NOVX phenotypes. In a specific embodiment, for one well of a 24-well plate,
approximately 0.84 µg of the siRNA duplex is generally sufficient. Cells are typically
seeded the previous day, and are transfected at about 50% confluence. The choice of cell
culture media and conditions is routine to those of skill in the art, and will vary with the
choice of cell type. The efficiency of transfection may depend on the cell type, but also on
the passage number and the confluency of the cells. The time and the manner of formation
of siRNA-liposome complexes (e.g., inversion versus vortexing) are also critical. Low
transfection efficiencies are the most frequent cause of unsuccessful NOVX silencing. The
efficiency of transfection needs to be carefully examined for each new cell line to be used.
Preferred cells are derived from a mammal, more preferably from a rodent such as a rat or
mouse, and most preferably from a human. Where used for therapeutic treatment, the cells
are preferentially autologous, although non-autologous cell sources are also contemplated
as within the scope of the present invention.
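
The per-well amount of 0.84 µg can be related to molar quantities with a short calculation. The Python sketch below is a rough estimate under stated assumptions: an average mass of 330 g/mol per nucleotide and a hypothetical final volume of 600 µl per well, neither of which is taken from the text.

```python
# Assumed average nucleotide mass; real values depend on sequence and salt form.
AVG_NT_MASS_G_PER_MOL = 330.0


def duplex_mass_g_per_mol(nt_per_strand=21):
    """Approximate molar mass of a duplex with two strands of equal length."""
    return 2 * nt_per_strand * AVG_NT_MASS_G_PER_MOL


def micrograms_to_pmol(micrograms, g_per_mol):
    """Convert a mass in micrograms to an amount in picomoles."""
    return micrograms * 1e-6 / g_per_mol * 1e12


def final_concentration_nM(pmol, volume_ul):
    """pmol per microliter is micromolar; multiply by 1000 for nanomolar."""
    return pmol / volume_ul * 1000.0


pmol_per_well = micrograms_to_pmol(0.84, duplex_mass_g_per_mol())
nM_in_well = final_concentration_nM(pmol_per_well, volume_ul=600.0)  # assumed volume
wells_per_milligram = int(1000.0 / 0.84)
```

Under these assumptions, 0.84 µg corresponds to roughly 60 pmol of a 21-nt duplex, or about 100 nM in the assumed volume, and 1 milligram of siRNA covers somewhat more than 1000 wells.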

For a control experiment, transfection of 0.84 µg single-stranded sense NOVX
siRNA will have no effect on NOVX silencing, and 0.84 µg antisense siRNA has a weak
silencing effect when compared to 0.84 µg of duplex siRNAs. Control experiments again
allow for a comparative analysis of the wild-type and silenced NOVX phenotypes. To
control for transfection efficiency, targeting of common proteins is typically performed, for
example targeting of lamin A/C or transfection of a CMV-driven EGFP-expression plasmid
(e.g., commercially available from Clontech). In the above example, the
fraction of lamin A/C knockdown in cells is determined the next day by such techniques as
immunofluorescence, Western blot, Northern blot or other similar assays for protein
expression or gene expression. Lamin A/C monoclonal antibodies may be obtained from
Santa Cruz Biotechnology.

Depending on the abundance and the half life (or turnover) of the targeted NOVX
polynucleotide in a cell, a knock-down phenotype may become apparent after 1 to 3 days,
or even later. In cases where no NOVX knock-down phenotype is observed, depletion of
the NOVX polynucleotide may be observed by immunofluorescence or Western blotting.
If the NOVX polynucleotide is still abundant after 3 days, cells need to be split and
transferred to a fresh 24-well plate for re-transfection. If no knock-down of the targeted
protein is observed, it may be desirable to analyze whether the target mRNA (NOVX or a
NOVX upstream or downstream gene) was effectively destroyed by the transfected siRNA
duplex. Two days after transfection, total RNA is prepared, reverse transcribed using a
target-specific primer, and PCR-amplified with a primer pair covering at least one
exon-exon junction in order to control for amplification of pre-mRNAs. RT/PCR of a
non-targeted mRNA is also needed as control. Effective depletion of the mRNA yet
undetectable reduction of target protein may indicate that a large reservoir of stable NOVX
protein may exist in the cell. Multiple transfection in sufficiently long intervals may be
necessary until the target protein is finally depleted to a point where a phenotype may
become apparent. If multiple transfection steps are required, cells are split 2 to 3 days after
transfection. The cells may be transfected immediately after splitting.

An inventive therapeutic method of the invention contemplates administering a 
NOVX siRNA construct as therapy to compensate for increased or aberrant NOVX 
expression or activity. The NOVX ribopolynucleotide is obtained and processed into 
siRNA fragments, or a NOVX siRNA is synthesized, as described above. The NOVX 
siRNA is administered to cells or tissues using known nucleic acid transfection techniques, 
as described above. A NOVX siRNA specific for a NOVX gene will decrease or 
knockdown NOVX transcription products, which will lead to reduced NOVX polypeptide 
production, resulting in reduced NOVX polypeptide activity in the cells or tissues. 

The present invention also encompasses a method of treating a disease or condition 
associated with the presence of a NOVX protein in an individual comprising administering 
to the individual an RNAi construct that targets the mRNA of the protein (the mRNA that 
encodes the protein) for degradation. A specific RNAi construct includes a siRNA or a 
double stranded gene transcript that is processed into siRNAs. Upon treatment, the target 
protein is not produced or is not produced to the extent it would be in the absence of the 
treatment. 

Where the NOVX gene function is not correlated with a known phenotype, a 
control sample of cells or tissues from healthy individuals provides a reference standard for 
determining NOVX expression levels. Expression levels are detected using the assays 
described, e.g., RT-PCR, Northern blotting, Western blotting, ELISA, and the like. A
subject sample of cells or tissues is taken from a mammal, preferably a human subject,
suffering from a disease state. The NOVX ribopolynucleotide is used to produce siRNA
constructs that are specific for the NOVX gene product. These cells or tissues are treated
by administering NOVX siRNAs to the cells or tissues by methods described for the
transfection of nucleic acids into a cell or tissue, and a change in NOVX polypeptide or
polynucleotide expression is observed in the subject sample relative to the control sample,
using the assays described. This NOVX gene knockdown approach provides a rapid
method for determination of a NOVX minus (NOVX-) phenotype in the treated subject
sample. The NOVX- phenotype observed in the treated subject sample thus serves as a
marker for monitoring the course of a disease state during treatment.

In specific embodiments, a NOVX siRNA is used in therapy. Methods for the
generation and use of a NOVX siRNA are known to those skilled in the art. Example
techniques are provided below.

Production of RNAs 

Sense RNA (ssRNA) and antisense RNA (asRNA) of NOVX are produced using
known methods such as transcription in RNA expression vectors. In the initial
experiments, the sense and antisense RNA are about 500 bases in length each. The
produced ssRNA and asRNA (0.5 µM) in 10 mM Tris-HCl (pH 7.5) with 20 mM NaCl
are heated to 95° C for 1 min, then cooled and annealed at room temperature for 12 to 16
h. The RNAs are precipitated and resuspended in lysis buffer (below). To monitor
annealing, RNAs are electrophoresed in a 2% agarose gel in TBE buffer and stained with
ethidium bromide. See, e.g., Sambrook et al., Molecular Cloning. Cold Spring Harbor
Laboratory Press, Plainview, N.Y. (1989).

Lysate Preparation 

Untreated rabbit reticulocyte lysate (Ambion) is assembled according to the
manufacturer's directions. dsRNA is incubated in the lysate at 30° C for 10 min prior to the
addition of mRNAs. Then NOVX mRNAs are added and the incubation continued for an
additional 60 min. The molar ratio of double stranded RNA and mRNA is about 200:1.
The NOVX mRNA is radiolabeled (using known techniques) and its stability is monitored
by gel electrophoresis.

In a parallel experiment made with the same conditions, the double stranded RNA is
internally radiolabeled with α-32P-ATP. Reactions are stopped by the addition of 2X
proteinase K buffer and deproteinized as described previously (Tuschl et al., Genes Dev.,
13:3191-3197 (1999)). Products are analyzed by electrophoresis in 15% or 18%
polyacrylamide sequencing gels using appropriate RNA standards. By monitoring the gels
for radioactivity, the natural production of 10 to 25 nt RNAs from the double stranded
RNA can be determined.

The band of double stranded RNA, about 21-23 bps, is eluted. The efficacy of
these 21-23 mers for suppressing NOVX transcription is assayed in vitro using the same
rabbit reticulocyte assay described above using 50 nanomolar of double stranded 21-23 mer
for each assay. The sequence of these 21-23 mers is then determined using standard
nucleic acid sequencing techniques.

RNA Preparation 

21 nt RNAs, based on the sequence determined above, are chemically synthesized
using Expedite RNA phosphoramidites and thymidine phosphoramidite (Proligo,
Germany). Synthetic oligonucleotides are deprotected and gel-purified (Elbashir,
Lendeckel, & Tuschl, Genes & Dev. 15, 188-200 (2001)), followed by Sep-Pak C18
cartridge (Waters, Milford, Mass., USA) purification (Tuschl, et al., Biochemistry,
32:11658-11668 (1993)).

These RNA single strands (20 µM) are incubated in annealing buffer (100 mM
potassium acetate, 30 mM HEPES-KOH at pH 7.4, 2 mM magnesium acetate) for 1 min at
90° C followed by 1 h at 37° C.

Cell Culture 

A cell culture known in the art to regularly express NOVX is propagated using
standard conditions. 24 hours before transfection, at approx. 80% confluency, the cells are
trypsinized and diluted 1:5 with fresh medium without antibiotics (1-3 x 10^5 cells/ml) and
transferred to 24-well plates (500 µl/well). Transfection is performed using a
commercially available lipofection kit and NOVX expression is monitored using standard
techniques with positive and negative control. A positive control is cells that naturally
express NOVX while a negative control is cells that do not express NOVX. Base-paired 21
and 22 nt siRNAs with overhanging 3' ends mediate efficient sequence-specific mRNA
degradation in lysates and in cell culture. Different concentrations of siRNAs are used. An
efficient concentration for suppression in vitro in mammalian culture is between 25 nM and
100 nM final concentration. This indicates that siRNAs are effective at concentrations that
are several orders of magnitude below the concentrations applied in conventional antisense
or ribozyme gene targeting experiments.

The above method provides a way both for the deduction of NOVX siRNA
sequence and the use of such siRNA for in vitro suppression. In vivo suppression may be
performed using the same siRNA using well known in vivo transfection or gene therapy
transfection techniques.

Antisense Nucleic Acids 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules
that are hybridizable to or complementary to the nucleic acid molecule comprising the
nucleotide sequence of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, or
fragments, analogs or derivatives thereof. An "antisense" nucleic acid comprises a
nucleotide sequence that is complementary to a "sense" nucleic acid encoding a protein
(e.g., complementary to the coding strand of a double-stranded cDNA molecule or
complementary to an mRNA sequence). In specific aspects, antisense nucleic acid
molecules are provided that comprise a sequence complementary to at least about 10, 25,
50, 100, 250 or 500 nucleotides or an entire NOVX coding strand, or to only a portion
thereof. Nucleic acid molecules encoding fragments, homologs, derivatives and analogs of
a NOVX protein of SEQ ID NO:2n, wherein n is an integer between 1 and 124, or
antisense nucleic acids complementary to a NOVX nucleic acid sequence of SEQ ID
NO:2n-1, wherein n is an integer between 1 and 124, are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding
region" of the coding strand of a nucleotide sequence encoding a NOVX protein. The term
"coding region" refers to the region of the nucleotide sequence comprising codons which
are translated into amino acid residues. In another embodiment, the antisense nucleic acid
molecule is antisense to a "noncoding region" of the coding strand of a nucleotide sequence
encoding the NOVX protein. The term "noncoding region" refers to 5' and 3' sequences
which flank the coding region that are not translated into amino acids (i.e., also referred to
as 5' and 3' untranslated regions).

Given the coding strand sequences encoding the NOVX protein disclosed herein,
antisense nucleic acids of the invention can be designed according to the rules of Watson
and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule can be
complementary to the entire coding region of NOVX mRNA, but more preferably is an
oligonucleotide that is antisense to only a portion of the coding or noncoding region of
NOVX mRNA. For example, the antisense oligonucleotide can be complementary to the
region surrounding the translation start site of NOVX mRNA. An antisense
oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50
nucleotides in length. An antisense nucleic acid of the invention can be constructed using
chemical synthesis or enzymatic ligation reactions using procedures known in the art. For
example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically
synthesized using naturally-occurring nucleotides or variously modified nucleotides
designed to increase the biological stability of the molecules or to increase the physical
stability of the duplex formed between the antisense and sense nucleic acids (e.g.,
phosphorothioate derivatives and acridine substituted nucleotides can be used).
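
Designing an antisense oligonucleotide by Watson and Crick pairing amounts to taking the reverse complement of the chosen window of the coding strand. The Python sketch below is a minimal illustration for a window surrounding the translation start site; the function name, the default length of 20 nucleotides and the 10-nucleotide upstream margin are hypothetical choices for this example. The oligonucleotide is returned in DNA letters, written 5' to 3'.

```python
# Illustrative helper; names and defaults are assumptions for this sketch.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(mrna, start_codon_index, length=20, upstream=10):
    """Return an antisense DNA oligo (5' to 3') spanning the start codon.

    mrna may be given in RNA or DNA letters; start_codon_index is the
    0-based position of the A of the AUG start codon.
    """
    coding = mrna.upper().replace("U", "T")
    begin = max(0, start_codon_index - upstream)
    window = coding[begin:begin + length]
    if len(window) < length:
        raise ValueError("sequence too short for the requested window")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return window.translate(DNA_COMPLEMENT)[::-1]
```

Taking the reverse complement of the returned oligonucleotide recovers the targeted window, which offers a quick check that the design is complementary to the intended region.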

Examples of modified nucleotides that can be used to generate the antisense nucleic
acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine,
xanthine, 4-acetylcytosine, 5-carboxymethylaminomethyl-2-thiouridine,
5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyluracil, dihydrouracil,
beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine,
1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine,
5-methoxyuracil, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, 2-thiouracil,
4-thiouracil, beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine,
pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively,
the antisense nucleic acid can be produced biologically using an expression vector into
which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed
from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of
interest, described further in the following subsection).

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a NOVX protein to thereby inhibit expression of the protein (e.g.,
by inhibiting transcription and/or translation). The hybridization can be by conventional
nucleotide complementarity to form a stable duplex, or, for example, in the case of an
antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions
in the major groove of the double helix. An example of a route of administration of
antisense nucleic acid molecules of the invention includes direct injection at a tissue site.
Alternatively, antisense nucleic acid molecules can be modified to target selected cells and
then administered systemically. For example, for systemic administration, antisense
molecules can be modified such that they specifically bind to receptors or antigens
expressed on a selected cell surface (e.g., by linking the antisense nucleic acid molecules to
peptides or antibodies that bind to cell surface receptors or antigens). The antisense nucleic
acid molecules can also be delivered to cells using the vectors described herein. To achieve
sufficient nucleic acid molecules, vector constructs in which the antisense nucleic acid
molecule is placed under the control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is
an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units,
the strands run parallel to each other. See, e.g., Gaultier, et al., 1987. Nucl. Acids Res. 15:
6625-6641. The antisense nucleic acid molecule can also comprise a
2'-O-methylribonucleotide (See, e.g., Inoue, et al., 1987. Nucl. Acids Res. 15: 6131-6148) or
a chimeric RNA-DNA analogue (See, e.g., Inoue, et al., 1987. FEBS Lett. 215: 327-330).

Ribozymes and PNA Moieties 

Nucleic acid modifications include, by way of non-limiting example, modified
bases, and nucleic acids whose sugar phosphate backbones are modified or derivatized.
These modifications are carried out at least in part to enhance the chemical stability of the
modified nucleic acid, such that they may be used, for example, as antisense binding
nucleic acids in therapeutic applications in a subject.

In one embodiment, an antisense nucleic acid of the invention is a ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of
cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a
complementary region. Thus, ribozymes (e.g., hammerhead ribozymes as described in
Haselhoff and Gerlach 1988. Nature 334: 585-591) can be used to catalytically cleave
NOVX mRNA transcripts to thereby inhibit translation of NOVX mRNA. A ribozyme
having specificity for a NOVX-encoding nucleic acid can be designed based upon the
nucleotide sequence of a NOVX cDNA disclosed herein (i.e., SEQ ID NO:2n-1, wherein n
is an integer between 1 and 124). For example, a derivative of a Tetrahymena L-19 IVS
RNA can be constructed in which the nucleotide sequence of the active site is
complementary to the nucleotide sequence to be cleaved in a NOVX-encoding mRNA.
See, e.g., U.S. Patent 4,987,071 to Cech, et al. and U.S. Patent 5,116,742 to Cech, et al.
NOVX mRNA can also be used to select a catalytic RNA having a specific ribonuclease
activity from a pool of RNA molecules. See, e.g., Bartel, et al., (1993) Science
261:1411-1418.

Alternatively, NOVX gene expression can be inhibited by targeting nucleotide
sequences complementary to the regulatory region of the NOVX nucleic acid (e.g., the
NOVX promoter and/or enhancers) to form triple helical structures that prevent
transcription of the NOVX gene in target cells. See, e.g., Helene, 1991. Anticancer Drug
Des. 6: 569-84; Helene, et al., 1992. Ann. N.Y. Acad. Sci. 660: 27-36; Maher, 1992.
Bioassays 14: 807-15.

In various embodiments, the NOVX nucleic acids can be modified at the base
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization,
or solubility of the molecule. For example, the deoxyribose phosphate backbone of the
nucleic acids can be modified to generate peptide nucleic acids. See, e.g., Hyrup, et al.,
1996. Bioorg Med Chem 4: 5-23. As used herein, the terms "peptide nucleic acids" or
"PNAs" refer to nucleic acid mimics (e.g., DNA mimics) in which the deoxyribose
phosphate backbone is replaced by a pseudopeptide backbone and only the four natural
nucleotide bases are retained. The neutral backbone of PNAs has been shown to allow for
specific hybridization to DNA and RNA under conditions of low ionic strength. The
synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis
protocols as described in Hyrup, et al., 1996. supra; Perry-O'Keefe, et al., 1996. Proc. Natl.
Acad. Sci. USA 93: 14670-14675.

PNAs of NOVX can be used in therapeutic and diagnostic applications. For
example, PNAs can be used as antisense or antigene agents for sequence-specific
modulation of gene expression by, e.g., inducing transcription or translation arrest or
inhibiting replication. PNAs of NOVX can also be used, for example, in the analysis of
single base pair mutations in a gene (e.g., PNA directed PCR clamping); as artificial
restriction enzymes when used in combination with other enzymes, e.g., S1 nucleases (See,
Hyrup, et al., 1996. supra); or as probes or primers for DNA sequence and hybridization
(See, Hyrup, et al., 1996, supra; Perry-O'Keefe, et al., 1996. supra).

In another embodiment, PNAs of NOVX can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras of NOVX can be generated
that may combine the advantageous properties of PNA and DNA. Such chimeras allow
DNA recognition enzymes (e.g., RNase H and DNA polymerases) to interact with the DNA
portion while the PNA portion would provide high binding affinity and specificity.
PNA-DNA chimeras can be linked using linkers of appropriate lengths selected in terms of
base stacking, number of bonds between the nucleotide bases, and orientation (see, Hyrup,
et al., 1996. supra). The synthesis of PNA-DNA chimeras can be performed as described
in Hyrup, et al., 1996. supra and Finn, et al., 1996. Nucl Acids Res 24: 3357-3363. For
example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the
PNA and the 5' end of DNA. See, e.g., Mag, et al., 1989. Nucl Acid Res 17: 5973-5988.
PNA monomers are then coupled in a stepwise manner to produce a chimeric molecule
with a 5' PNA segment and a 3' DNA segment. See, e.g., Finn, et al., 1996. supra.
Alternatively, chimeric molecules can be synthesized with a 5' DNA segment and a 3' PNA
segment. See, e.g., Petersen, et al., 1975. Bioorg. Med. Chem. Lett. 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such
as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport
across the cell membrane (see, e.g., Letsinger, et al., 1989. Proc. Natl. Acad. Sci. U.S.A. 86:
6553-6556; Lemaitre, et al., 1987. Proc. Natl. Acad. Sci. 84: 648-652; PCT Publication No.
WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No. WO 89/10134). In
addition, oligonucleotides can be modified with hybridization triggered cleavage agents
(see, e.g., Krol, et al., 1988. BioTechniques 6:958-976) or intercalating agents (see, e.g.,
Zon, 1988. Pharm. Res. 5: 539-549). To this end, the oligonucleotide may be conjugated to
another molecule, e.g., a peptide, a hybridization triggered cross-linking agent, a transport
agent, a hybridization-triggered cleavage agent, and the like.

NOVX Polypeptides 

A polypeptide according to the invention includes a polypeptide including the
amino acid sequence of NOVX polypeptides whose sequences are provided in any one of
SEQ ID NO:2n, wherein n is an integer between 1 and 124. The invention also includes a
mutant or variant protein any of whose residues may be changed from the corresponding
residues shown in any one of SEQ ID NO:2n, wherein n is an integer between 1 and 124,
while still encoding a protein that maintains its NOVX activities and physiological
functions, or a functional fragment thereof.

In general, a NOVX variant that preserves NOVX-like function includes any variant
in which residues at a particular position in the sequence have been substituted by other
amino acids, and further include the possibility of inserting an additional residue or
residues between two residues of the parent protein as well as the possibility of deleting
one or more residues from the parent sequence. Any amino acid substitution, insertion, or
deletion is encompassed by the invention. In favorable circumstances, the substitution is a
conservative substitution as defined above.

One aspect of the invention pertains to isolated NOVX proteins, and
biologically-active portions thereof, or derivatives, fragments, analogs or homologs thereof.
Also provided are polypeptide fragments suitable for use as immunogens to raise
anti-NOVX antibodies. In one embodiment, native NOVX proteins can be isolated from
cells or tissue sources by an appropriate purification scheme using standard protein
purification techniques. In another embodiment, NOVX proteins are produced by
recombinant DNA techniques. Alternative to recombinant expression, a NOVX protein or
polypeptide can be synthesized chemically using standard peptide synthesis techniques.

An "isolated" or "purified" polypeptide or protein or biologically-active portion
thereof is substantially free of cellular material or other contaminating proteins from the
cell or tissue source from which the NOVX protein is derived, or substantially free from
chemical precursors or other chemicals when chemically synthesized. The language
"substantially free of cellular material" includes preparations of NOVX proteins in which
the protein is separated from cellular components of the cells from which it is isolated or
recombinantly-produced. In one embodiment, the language "substantially free of cellular
material" includes preparations of NOVX proteins having less than about 30% (by dry
weight) of non-NOVX proteins (also referred to herein as a "contaminating protein"), more
preferably less than about 20% of non-NOVX proteins, still more preferably less than about
10% of non-NOVX proteins, and most preferably less than about 5% of non-NOVX
proteins. When the NOVX protein or biologically-active portion thereof is
recombinantly-produced, it is also preferably substantially free of culture medium, i.e.,
culture medium represents less than about 20%, more preferably less than about 10%, and
most preferably less than about 5% of the volume of the NOVX protein preparation.

The language "substantially free of chemical precursors or other chemicals"
includes preparations of NOVX proteins in which the protein is separated from chemical
precursors or other chemicals that are involved in the synthesis of the protein. In one
embodiment, the language "substantially free of chemical precursors or other chemicals"
includes preparations of NOVX proteins having less than about 30% (by dry weight) of
chemical precursors or non-NOVX chemicals, more preferably less than about 20%
chemical precursors or non-NOVX chemicals, still more preferably less than about 10%
chemical precursors or non-NOVX chemicals, and most preferably less than about 5%
chemical precursors or non-NOVX chemicals.

Biologically-active portions of NOVX proteins include peptides comprising amino
acid sequences sufficiently homologous to or derived from the amino acid sequences of the
NOVX proteins (e.g., the amino acid sequence of SEQ ID NO:2n, wherein n is an integer
between 1 and 124) that include fewer amino acids than the full-length NOVX proteins,
and exhibit at least one activity of a NOVX protein. Typically, biologically-active portions
comprise a domain or motif with at least one activity of the NOVX protein. A
biologically-active portion of a NOVX protein can be a polypeptide which is, for example,
10, 25, 50, 100 or more amino acid residues in length.

Moreover, other biologically-active portions, in which other regions of the protein
are deleted, can be prepared by recombinant techniques and evaluated for one or more of
the functional activities of a native NOVX protein.

In an embodiment, the NOVX protein has an amino acid sequence of SEQ ID
NO:2n, wherein n is an integer between 1 and 124. In other embodiments, the NOVX
protein is substantially homologous to SEQ ID NO:2n, wherein n is an integer between 1
and 124, and retains the functional activity of the protein of SEQ ID NO:2n, wherein n is
an integer between 1 and 124, yet differs in amino acid sequence due to natural allelic
variation or mutagenesis, as described in detail, below. Accordingly, in another
embodiment, the NOVX protein is a protein that comprises an amino acid sequence at least
about 45% homologous to the amino acid sequence of SEQ ID NO:2n, wherein n is an
integer between 1 and 124, and retains the functional activity of the NOVX proteins of
SEQ ID NO:2n, wherein n is an integer between 1 and 124.

Determining Homology Between Two or More Sequences 

To determine the percent homology of two amino acid sequences or of two nucleic
acids, the sequences are aligned for optimal comparison purposes (e.g., gaps can be
introduced in the sequence of a first amino acid or nucleic acid sequence for optimal
alignment with a second amino or nucleic acid sequence). The amino acid residues or
nucleotides at corresponding amino acid positions or nucleotide positions are then
compared. When a position in the first sequence is occupied by the same amino acid
residue or nucleotide as the corresponding position in the second sequence, then the
molecules are homologous at that position (i.e., as used herein amino acid or nucleic acid
"homology" is equivalent to amino acid or nucleic acid "identity").

The nucleic acid sequence homology may be determined as the degree of identity
between two sequences. The homology may be determined using computer programs
known in the art, such as GAP software provided in the GCG program package. See,
Needleman and Wunsch, 1970. J Mol Biol 48: 443-453. Using GCG GAP software with
the following settings for nucleic acid sequence comparison: GAP creation penalty of 5.0
and GAP extension penalty of 0.3, the coding region of the analogous nucleic acid
sequences referred to above exhibits a degree of identity preferably of at least 70%, 75%,
80%, 85%, 90%, 95%, 98%, or 99%, with the CDS (encoding) part of the DNA sequence
of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124.

The term "sequence identity" refers to the degree to which two polynucleotide or
polypeptide sequences are identical on a residue-by-residue basis over a particular region of
comparison. The term "percentage of sequence identity" is calculated by comparing two
optimally aligned sequences over that region of comparison, determining the number of
positions at which the identical nucleic acid base (e.g., A, T, C, G, U, or I, in the case of
nucleic acids) occurs in both sequences to yield the number of matched positions, dividing
the number of matched positions by the total number of positions in the region of
comparison (i.e., the window size), and multiplying the result by 100 to yield the
percentage of sequence identity. The term "substantial identity" as used herein denotes a
characteristic of a polynucleotide sequence, wherein the polynucleotide comprises a
sequence that has at least 80 percent sequence identity, preferably at least 85 percent
identity and often 90 to 95 percent sequence identity, more usually at least 99 percent
sequence identity as compared to a reference sequence over a comparison region.
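
The percentage calculation described above can be sketched in a few lines. This is a minimal illustration, not the GCG GAP program: it assumes the two sequences have already been optimally aligned, that gaps are written as "-", and that gap columns count toward the window size but never toward the matches. The function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Matched positions divided by window size, times 100, for a pre-aligned pair."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison region is empty")
    # A position is matched only when both sequences carry the same residue (not a gap).
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    window_size = len(aligned_a)  # assumption: every aligned column is in the window
    return 100.0 * matches / window_size
```

With the toy alignment `ACGT-ACGTA` against `ACGTTACGAA`, eight of ten columns match, so the function returns 80.0, which would sit exactly at the "substantial identity" floor of 80 percent stated above. Other conventions (for example, excluding gap columns from the window) give different values for the same alignment.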

Chimeric and Fusion Proteins 

The invention also provides NOVX chimeric or fusion proteins. As used herein, a
NOVX "chimeric protein" or "fusion protein" comprises a NOVX polypeptide
operatively-linked to a non-NOVX polypeptide. An "NOVX polypeptide" refers to a
polypeptide having an amino acid sequence corresponding to a NOVX protein of SEQ ID
NO:2n, wherein n is an integer between 1 and 124, whereas a "non-NOVX polypeptide"
refers to a polypeptide having an amino acid sequence corresponding to a protein that is not
substantially homologous to the NOVX protein, e.g., a protein that is different from the
NOVX protein and that is derived from the same or a different organism. Within a NOVX
fusion protein the NOVX polypeptide can correspond to all or a portion of a NOVX
protein. In one embodiment, a NOVX fusion protein comprises at least one
biologically-active portion of a NOVX protein. In another embodiment, a NOVX fusion
protein comprises at least two biologically-active portions of a NOVX protein. In yet
another embodiment, a NOVX fusion protein comprises at least three biologically-active
portions of a NOVX protein. Within the fusion protein, the term "operatively-linked" is
intended to indicate that the NOVX polypeptide and the non-NOVX polypeptide are fused
in-frame with one another. The non-NOVX polypeptide can be fused to the N-terminus or
C-terminus of the NOVX polypeptide.

In one embodiment, the fusion protein is a GST-NOVX fusion protein in which the
NOVX sequences are fused to the C-terminus of the GST (glutathione S-transferase)
sequences. Such fusion proteins can facilitate the purification of recombinant NOVX
polypeptides.

In another embodiment, the fusion protein is a NOVX protein containing a 
heterologous signal sequence at its N-terminus. In certain host cells (e.g., mammalian host 
cells), expression and/or secretion of NOVX can be increased through use of a 
heterologous signal sequence. 

In yet another embodiment, the fusion protein is a NOVX-immunoglobulin fusion
protein in which the NOVX sequences are fused to sequences derived from a member of
the immunoglobulin protein family. The NOVX-immunoglobulin fusion proteins of the
invention can be incorporated into pharmaceutical compositions and administered to a
subject to inhibit an interaction between a NOVX ligand and a NOVX protein on the
surface of a cell, to thereby suppress NOVX-mediated signal transduction in vivo. The
NOVX-immunoglobulin fusion proteins can be used to affect the bioavailability of a
NOVX cognate ligand. Inhibition of the NOVX ligand/NOVX interaction may be useful
therapeutically for both the treatment of proliferative and differentiative disorders, as well
as modulating (e.g., promoting or inhibiting) cell survival. Moreover, the
NOVX-immunoglobulin fusion proteins of the invention can be used as immunogens to
produce anti-NOVX antibodies in a subject, to purify NOVX ligands, and in screening
assays to identify molecules that inhibit the interaction of NOVX with a NOVX ligand.

A NOVX chimeric or fusion protein of the invention can be produced by standard
recombinant DNA techniques. For example, DNA fragments coding for the different
polypeptide sequences are ligated together in-frame in accordance with conventional
techniques, e.g., by employing blunt-ended or stagger-ended termini for ligation, restriction
enzyme digestion to provide for appropriate termini, filling-in of cohesive ends as
appropriate, alkaline phosphatase treatment to avoid undesirable joining, and enzymatic
ligation. In another embodiment, the fusion gene can be synthesized by conventional
techniques including automated DNA synthesizers. Alternatively, PCR amplification of
gene fragments can be carried out using anchor primers that give rise to complementary
overhangs between two consecutive gene fragments that can subsequently be annealed and
reamplified to generate a chimeric gene sequence (see, e.g., Ausubel, et al. (eds.) Current
Protocols in Molecular Biology, John Wiley & Sons, 1992). Moreover, many
expression vectors are commercially available that already encode a fusion moiety (e.g., a
GST polypeptide). A NOVX-encoding nucleic acid can be cloned into such an expression
vector such that the fusion moiety is linked in-frame to the NOVX protein.

NOVX Agonists and Antagonists 

The invention also pertains to variants of the NOVX proteins that function as either
NOVX agonists (i.e., mimetics) or as NOVX antagonists. Variants of the NOVX protein
can be generated by mutagenesis (e.g., discrete point mutation or truncation of the NOVX
protein). An agonist of the NOVX protein can retain substantially the same, or a subset of,
the biological activities of the naturally occurring form of the NOVX protein. An
antagonist of the NOVX protein can inhibit one or more of the activities of the naturally
occurring form of the NOVX protein by, for example, competitively binding to a
downstream or upstream member of a cellular signaling cascade which includes the NOVX
protein. Thus, specific biological effects can be elicited by treatment with a variant of
limited function. In one embodiment, treatment of a subject with a variant having a subset
of the biological activities of the naturally occurring form of the protein has fewer side
effects in a subject relative to treatment with the naturally occurring form of the NOVX
proteins.

Variants of the NOVX proteins that function as either NOVX agonists (i.e.,
mimetics) or as NOVX antagonists can be identified by screening combinatorial libraries of
mutants (e.g., truncation mutants) of the NOVX proteins for NOVX protein agonist or
antagonist activity. In one embodiment, a variegated library of NOVX variants is
generated by combinatorial mutagenesis at the nucleic acid level and is encoded by a
variegated gene library. A variegated library of NOVX variants can be produced by, for
example, enzymatically ligating a mixture of synthetic oligonucleotides into gene
sequences such that a degenerate set of potential NOVX sequences is expressible as
individual polypeptides, or alternatively, as a set of larger fusion proteins (e.g., for phage
display) containing the set of NOVX sequences therein. There are a variety of methods
which can be used to produce libraries of potential NOVX variants from a degenerate
oligonucleotide sequence. Chemical synthesis of a degenerate gene sequence can be
performed in an automatic DNA synthesizer, and the synthetic gene then ligated into an
appropriate expression vector. Use of a degenerate set of genes allows for the provision, in
one mixture, of all of the sequences encoding the desired set of potential NOVX sequences.
Methods for synthesizing degenerate oligonucleotides are well-known within the art. See,
e.g., Narang, 1983. Tetrahedron 39: 3; Itakura, et al., 1984. Annu. Rev. Biochem. 53: 323;
Itakura, et al., 1984. Science 198: 1056; Ike, et al., 1983. Nucl. Acids Res. 11:477.
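
To make concrete how a single degenerate oligonucleotide stands for a whole set of sequences "in one mixture", the sketch below enumerates the members of a degenerate sequence written in standard IUPAC ambiguity codes. The function names and the example oligonucleotides are illustrative assumptions and are not taken from the cited references.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo: str) -> list[str]:
    """List every concrete sequence represented by a degenerate oligonucleotide."""
    pools = [IUPAC[base] for base in oligo.upper()]
    return ["".join(choice) for choice in product(*pools)]


def library_size(oligo: str) -> int:
    """Count the represented sequences without enumerating them."""
    size = 1
    for base in oligo.upper():
        size *= len(IUPAC[base])
    return size
```

For instance, `expand_degenerate("ARY")` yields `["AAC", "AAT", "AGC", "AGT"]`, and `library_size("NNK")` is 32. Counting before enumerating is useful because the size grows multiplicatively with each degenerate position.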

Polypeptide Libraries 

In addition, libraries of fragments of the NOVX protein coding sequences can be
used to generate a variegated population of NOVX fragments for screening and subsequent
selection of variants of a NOVX protein. In one embodiment, a library of coding sequence
fragments can be generated by treating a double stranded PCR fragment of a NOVX coding
sequence with a nuclease under conditions wherein nicking occurs only about once per
molecule, denaturing the double stranded DNA, renaturing the DNA to form
double-stranded DNA that can include sense/antisense pairs from different nicked products,
removing single-stranded portions from reformed duplexes by treatment with S1 nuclease,
and ligating the resulting fragment library into an expression vector. By this method,
expression libraries can be derived which encode N-terminal and internal fragments of
various sizes of the NOVX proteins.

Various techniques are known in the art for screening gene products of
combinatorial libraries made by point mutations or truncation, and for screening cDNA
libraries for gene products having a selected property. Such techniques are adaptable for
rapid screening of the gene libraries generated by the combinatorial mutagenesis of NOVX
proteins. The most widely used techniques, which are amenable to high throughput
analysis, for screening large gene libraries typically include cloning the gene library into
replicable expression vectors, transforming appropriate cells with the resulting library of
vectors, and expressing the combinatorial genes under conditions in which detection of a
desired activity facilitates isolation of the vector encoding the gene whose product was
detected. Recursive ensemble mutagenesis (REM), a new technique that enhances the
frequency of functional mutants in the libraries, can be used in combination with the
screening assays to identify NOVX variants. See, e.g., Arkin and Yourvan, 1992. Proc.
Natl. Acad. Sci. USA 89: 7811-7815; Delgrave, et al., 1993. Protein Engineering
6:327-331.

Anti-NOVX Antibodies 

Included in the invention are antibodies to NOVX proteins, or fragments of NOVX
proteins. The term "antibody" as used herein refers to immunoglobulin molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that
contain an antigen binding site that specifically binds (immunoreacts with) an antigen.
Such antibodies include, but are not limited to, polyclonal, monoclonal, chimeric, single
chain, Fab, Fab' and F(ab')2 fragments, and an Fab expression library. In general, antibody
molecules obtained from humans relate to any of the classes IgG, IgM, IgA, IgE and IgD,
which differ from one another by the nature of the heavy chain present in the molecule.
Certain classes have subclasses as well, such as IgG1, IgG2, and others. Furthermore, in
humans, the light chain may be a kappa chain or a lambda chain. Reference herein to
antibodies includes a reference to all such classes, subclasses and types of human antibody
species.

An isolated protein of the invention intended to serve as an antigen, or a portion or
fragment thereof, can be used as an immunogen to generate antibodies that
immunospecifically bind the antigen, using standard techniques for polyclonal and
monoclonal antibody preparation. The full-length protein can be used or, alternatively, the
invention provides antigenic peptide fragments of the antigen for use as immunogens. An
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid
sequence of the full length protein, such as an amino acid sequence of SEQ ID NO:2n,
wherein n is an integer between 1 and 124, and encompasses an epitope thereof such that
an antibody raised against the peptide forms a specific immune complex with the full
length protein or with any fragment that contains the epitope. Preferably, the antigenic
peptide comprises at least 10 amino acid residues, or at least 15 amino acid residues, or at
least 20 amino acid residues, or at least 30 amino acid residues. Preferred epitopes
encompassed by the antigenic peptide are regions of the protein that are located on its
surface; commonly these are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of NOVX that is located on the surface of the protein, e.g., a
hydrophilic region. A hydrophobicity analysis of the human NOVX protein sequence will
indicate which regions of a NOVX polypeptide are particularly hydrophilic and, therefore,
are likely to encode surface residues useful for targeting antibody production. As a means
for targeting antibody production, hydropathy plots showing regions of hydrophilicity and
hydrophobicity may be generated by any method well known in the art, including, for
example, the Kyte Doolittle or the Hopp Woods methods, either with or without Fourier
transformation. See, e.g., Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78:
3824-3828; Kyte and Doolittle 1982, J. Mol. Biol. 157: 105-142, each incorporated herein
by reference in their entirety. Antibodies that are specific for one or more domains within
an antigenic protein, or derivatives, fragments, analogs or homologs thereof, are also
provided herein.
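
A hydropathy profile of the kind referred to above can be approximated with a sliding-window average over the published Kyte-Doolittle hydropathy index. The sketch below is illustrative only: the window length of 7, the cutoff of 0.0 for calling a window hydrophilic, and the function names are assumptions chosen for the example, and no Fourier transformation is applied.

```python
# Kyte-Doolittle hydropathy index (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 7) -> list[float]:
    """Mean hydropathy of each full window along the sequence."""
    protein = protein.upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in protein]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(protein) - window + 1)
    ]


def hydrophilic_windows(protein: str, window: int = 7, cutoff: float = 0.0) -> list[int]:
    """Start indices (0-based) of windows whose mean hydropathy is below the cutoff."""
    profile = hydropathy_profile(protein, window)
    return [i for i, score in enumerate(profile) if score < cutoff]
```

Windows flagged by `hydrophilic_windows` are candidate surface regions in the sense of the paragraph above; whether a given region is actually exposed or antigenic would still have to be established experimentally.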

The term "epitope" includes any protein determinant capable of specific binding to
an immunoglobulin or T-cell receptor. Epitopic determinants usually consist of chemically
active surface groupings of molecules such as amino acids or sugar side chains and usually
have specific three dimensional structural characteristics, as well as specific charge
characteristics. A NOVX polypeptide or a fragment thereof comprises at least one antigenic
epitope. An anti-NOVX antibody of the present invention is said to specifically bind to
antigen NOVX when the equilibrium binding constant (KD) is < 1 μM, preferably < 100
nM, more preferably < 10 nM, and most preferably < 100 pM to about 1 pM, as measured
by assays such as radioligand binding assays or similar assays known to those skilled in the
art.
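
The tiered binding criterion can be expressed as a simple comparison against descending thresholds. The sketch below assumes the outermost threshold is 1 micromolar, which is the reading consistent with the descending series 100 nM, 10 nM, 100 pM; the tier labels and the function name are hypothetical, and the KD value is expected in molar units.

```python
def binding_tier(kd_molar: float) -> str:
    """Classify an equilibrium binding constant (in mol/L) against the stated tiers."""
    if kd_molar <= 0:
        raise ValueError("KD must be a positive concentration")
    if kd_molar < 100e-12:   # below 100 pM
        return "most preferred"
    if kd_molar < 10e-9:     # below 10 nM
        return "more preferred"
    if kd_molar < 100e-9:    # below 100 nM
        return "preferred"
    if kd_molar < 1e-6:      # below 1 uM (assumed reading of the first threshold)
        return "specific"
    return "not specific"
```

Under these assumptions an antibody measured at 5 nM falls in the "more preferred" tier, while one measured at 2 micromolar would not meet the specific-binding criterion.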

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components. 

Various procedures known within the art may be used for the production of
polyclonal or monoclonal antibodies directed against a protein of the invention, or against
derivatives, fragments, analogs, homologs or orthologs thereof (see, for example,
Antibodies: A Laboratory Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY, incorporated herein by reference). Some of
these antibodies are discussed below.

Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g.,
rabbit, goat, mouse or other mammal) may be immunized by one or more injections with
the native protein, a synthetic variant thereof, or a derivative of the foregoing. An
appropriate immunogenic preparation can contain, for example, the naturally occurring
immunogenic protein, a chemically synthesized polypeptide representing the immunogenic
protein, or a recombinantly expressed immunogenic protein. Furthermore, the protein may
be conjugated to a second protein known to be immunogenic in the mammal being
immunized. Examples of such immunogenic proteins include but are not limited to
keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, and soybean trypsin
inhibitor. The preparation can further include an adjuvant. Various adjuvants used to
increase the immunological response include, but are not limited to, Freund's (complete and
incomplete), mineral gels (e.g., aluminum hydroxide), surface active substances (e.g.,
lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, dinitrophenol, etc.),
adjuvants usable in humans such as Bacille Calmette-Guerin and Corynebacterium parvum,
or similar immunostimulatory agents. Additional examples of adjuvants which can be
employed include MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic trehalose
dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can be
isolated from the mammal (e.g., from the blood) and further purified by well known
techniques, such as affinity chromatography using protein A or protein G, which provide
primarily the IgG fraction of immune serum. Subsequently, or alternatively, the specific
antigen which is the target of the immunoglobulin sought, or an epitope thereof, may be
immobilized on a column to purify the immune specific antibody by immunoaffinity
chromatography. Purification of immunoglobulins is discussed, for example, by D.
Wilkinson (The Scientist, published by The Scientist, Inc., Philadelphia PA, Vol. 14, No. 8
(April 17, 2000), pp. 25-28).

Monoclonal Antibodies

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as 
used herein, refers to a population of antibody molecules that contain only one molecular 
species of antibody molecule consisting of a unique light chain gene product and a unique
heavy chain gene product. In particular, the complementarity determining regions (CDRs) 
of the monoclonal antibody are identical in all the molecules of the population. MAbs thus 
contain an antigen binding site capable of immunoreacting with a particular epitope of the 
antigen characterized by a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a 
mouse, hamster, or other appropriate host animal, is typically immunized with an 
immunizing agent to elicit lymphocytes that produce or are capable of producing antibodies 
that will specifically bind to the immunizing agent. Alternatively, the lymphocytes can be
immunized in vitro. 

The immunizing agent will typically include the protein antigen, a fragment thereof 
or a fusion protein thereof. Generally, either peripheral blood lymphocytes are used if cells 
of human origin are desired, or spleen cells or lymph node cells are used if non-human
mammalian sources are desired. The lymphocytes are then fused with an immortalized cell
line using a suitable fusing agent, such as polyethylene glycol, to form a hybridoma cell
(Goding, Monoclonal Antibodies: Principles and Practice, Academic Press, (1986) pp.
59-103). Immortalized cell lines are usually transformed mammalian cells, particularly 
myeloma cells of rodent, bovine and human origin. Usually, rat or mouse myeloma cell
lines are employed. The hybridoma cells can be cultured in a suitable culture medium that
preferably contains one or more substances that inhibit the growth or survival of the 
unfused, immortalized cells. For example, if the parental cells lack the enzyme 
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture medium 
for the hybridomas typically will include hypoxanthine, aminopterin, and thymidine ("HAT
medium"), which substances prevent the growth of HGPRT-deficient cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable high 
level expression of antibody by the selected antibody-producing cells, and are sensitive to a 
medium such as HAT medium. More preferred immortalized cell lines are murine 
myeloma lines, which can be obtained, for instance, from the Salk Institute Cell
Distribution Center, San Diego, California and the American Type Culture Collection,
Manassas, Virginia. Human myeloma and mouse-human heteromyeloma cell lines also 
have been described for the production of human monoclonal antibodies (Kozbor, J. 
Immunol., 133:3001 (1984); Brodeur et al., Monoclonal Antibody Production Techniques
and Applications, Marcel Dekker, Inc., New York, (1987) pp. 51-63). 

The culture medium in which the hybridoma cells are cultured can then be assayed 
for the presence of monoclonal antibodies directed against the antigen. Preferably, the 
binding specificity of monoclonal antibodies produced by the hybridoma cells is
determined by immunoprecipitation or by an in vitro binding assay, such as
radioimmunoassay (RIA) or enzyme-linked immunosorbent assay (ELISA). Such
techniques and assays are known in the art. The binding affinity of the monoclonal 
antibody can, for example, be determined by the Scatchard analysis of Munson and Pollard, 
Anal. Biochem., 107:220 (1980). It is an objective, especially important in therapeutic
applications of monoclonal antibodies, to identify antibodies having a high degree of 
specificity and a high binding affinity for the target antigen. 
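
By way of nonlimiting illustration, the Scatchard analysis referred to above can be sketched as a straight-line fit. The following minimal Python sketch assumes a single class of non-interacting binding sites, for which bound/free = (Bmax - bound)/Kd. The function name, the variable names and the synthetic data are hypothetical and are not taken from Munson and Pollard.

```python
def scatchard_fit(bound, free):
    """Estimate (Kd, Bmax) from paired bound/free ligand concentrations.

    Assumes one class of independent sites, so that
    bound/free = (Bmax - bound) / Kd, i.e. a line with slope -1/Kd
    and x-intercept Bmax when bound/free is plotted against bound.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares line through the Scatchard points.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = -intercept / slope
    return kd, bmax


# Synthetic, noise-free data (hypothetical values, arbitrary units).
TRUE_KD, TRUE_BMAX = 2.0, 10.0
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [TRUE_BMAX * f / (TRUE_KD + f) for f in free_conc]
kd_est, bmax_est = scatchard_fit(bound_conc, free_conc)
```

A lower fitted Kd corresponds to a higher binding affinity. With real assay data the points scatter about the line, and curvature in the plot would suggest that the single-site assumption does not hold.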

After the desired hybridoma cells are identified, the clones can be subcloned by 
limiting dilution procedures and grown by standard methods (Goding, 1986). Suitable
culture media for this purpose include, for example, Dulbecco's Modified Eagle's Medium
and RPMI-1640 medium. Alternatively, the hybridoma cells can be grown in vivo as
ascites in a mammal.
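
As a hedged illustration of why subcloning by limiting dilution favors clonal populations, the following Python sketch assumes that the number of viable hybridoma cells seeded per well follows a Poisson distribution and that every seeded cell grows out. Both assumptions, and the function name, are introduced here for illustration only and are not part of the cited methods.

```python
import math


def clonality_by_limiting_dilution(mean_cells_per_well):
    """Return (fraction of wells with growth, fraction of those that are monoclonal).

    Assumes Poisson-distributed seeding and that each seeded cell grows out.
    """
    lam = mean_cells_per_well
    if lam <= 0:
        raise ValueError("mean_cells_per_well must be positive")
    p_empty = math.exp(-lam)          # wells receiving no cell
    p_single = lam * math.exp(-lam)   # wells receiving exactly one cell
    p_growth = 1.0 - p_empty          # wells receiving one or more cells
    return p_growth, p_single / p_growth


for lam in (0.3, 1.0, 3.0):
    growth, monoclonal = clonality_by_limiting_dilution(lam)
    print(f"{lam:.1f} cells/well: {growth:.2f} of wells grow, "
          f"{monoclonal:.2f} of those are monoclonal")
```

Under these assumptions, seeding at a lower mean density yields fewer wells with growth but a higher proportion of wells that derive from a single cell.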

The monoclonal antibodies secreted by the subclones can be isolated or purified 
from the culture medium or ascites fluid by conventional immunoglobulin purification 
procedures such as, for example, protein A-Sepharose, hydroxylapatite chromatography,
gel electrophoresis, dialysis, or affinity chromatography. 

The monoclonal antibodies can also be made by recombinant DNA methods, such 
as those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies 
of the invention can be readily isolated and sequenced using conventional procedures (e.g., 
by using oligonucleotide probes that are capable of binding specifically to genes encoding
the heavy and light chains of murine antibodies). The hybridoma cells of the invention 
serve as a preferred source of such DNA. Once isolated, the DNA can be placed into 
expression vectors, which are then transfected into host cells such as simian COS cells, 
Chinese hamster ovary (CHO) cells, or myeloma cells that do not otherwise produce 
immunoglobulin protein, to obtain the synthesis of monoclonal antibodies in the
recombinant host cells. The DNA also can be modified, for example, by substituting the
coding sequence for human heavy and light chain constant domains in place of the 
homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368, 812-13 
(1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin 
polypeptide can be substituted for the constant domains of an antibody of the invention, or 
can be substituted for the variable domains of one antigen-combining site of an antibody of 
the invention to create a chimeric bivalent antibody.

Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further 
comprise humanized antibodies or human antibodies. These antibodies are suitable for 
administration to humans without engendering an immune response by the human against
the administered immunoglobulin. Humanized forms of antibodies are chimeric
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab',
F(ab')2 or other antigen-binding subsequences of antibodies) that are principally comprised
of the sequence of a human immunoglobulin, and contain minimal sequence derived from a 
non-human immunoglobulin. Humanization can be performed following the method of
Winter and co-workers (Jones et al., Nature, 321:522-525 (1986); Riechmann et al., Nature,
332:323-327 (1988); Verhoeyen et al., Science, 239:1534-1536 (1988)), by substituting 
rodent CDRs or CDR sequences for the corresponding sequences of a human antibody. 
(See also U.S. Patent No. 5,225,539.) In some instances, Fv framework residues of the
human immunoglobulin are replaced by corresponding non-human residues. Humanized
antibodies can also comprise residues which are found neither in the recipient antibody nor
in the imported CDR or framework sequences. In general, the humanized antibody will 
comprise substantially all of at least one, and typically two, variable domains, in which all 
or substantially all of the CDR regions correspond to those of a non-human 
immunoglobulin and all or substantially all of the framework regions are those of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will
comprise at least a portion of an immunoglobulin constant region (Fc), typically that of a
human immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op.
Struct. Biol., 2:593-596 (1992)).

Human Antibodies

Fully human antibodies essentially relate to antibody molecules in which the entire
sequence of both the light chain and the heavy chain, including the CDRs, arises from
human genes. Such antibodies are termed "human antibodies", or "fully human antibodies"
herein. Human monoclonal antibodies can be prepared by the trioma technique; the human
B-cell hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV
hybridoma technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human 
monoclonal antibodies may be utilized in the practice of the present invention and may be
produced by using human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80:
2026-2030) or by transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et
al., 1985 In: Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 
77-96). 

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by 
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the 
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in
humans in all respects, including gene rearrangement, assembly, and antibody repertoire. 
This approach is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 
5,569,825; 5,625,126; 5,633,425; 5,661,016, and in Marks et al. (Biotechnology 10, 
779-783 (1992)); Lonberg et al. (Nature 368 856-859 (1994)); Morrison (Nature 368,
812-13 (1994)); Fishwild et al. (Nature Biotechnology 14, 845-51 (1996)); Neuberger
(Nature Biotechnology 14, 826 (1996)); and Lonberg and Huszar (Intern. Rev. Immunol. 
13 65-93 (1995)). 

Human antibodies may additionally be produced using transgenic nonhuman 
animals which are modified so as to produce fully human antibodies rather than the
animal's endogenous antibodies in response to challenge by an antigen. (See PCT
publication WO94/02602). The endogenous genes encoding the heavy and light 
immunoglobulin chains in the nonhuman host have been incapacitated, and active loci 
encoding human heavy and light chain immunoglobulins are inserted into the host's 
genome. The human genes are incorporated, for example, using yeast artificial
chromosomes containing the requisite human DNA segments. An animal which provides
all the desired modifications is then obtained as progeny by crossbreeding intermediate 
transgenic animals containing fewer than the full complement of the modifications. The 
preferred embodiment of such a nonhuman animal is a mouse, and is termed the 
Xenomouse™ as disclosed in PCT publications WO 96/33[illegible]. This
animal produces B cells which secrete fully human immunoglobulins. The antibodies can 
be obtained directly from the animal after immunization with an immunogen of interest, as, 
for example, a preparation of a polyclonal antibody, or alternatively from immortalized B 
cells derived from the animal, such as hybridomas producing monoclonal antibodies.
Additionally, the genes encoding the immunoglobulins with human variable regions can be
recovered and expressed to obtain the antibodies directly, or can be further modified to 
obtain analogs of antibodies such as, for example, single chain Fv molecules. 

An example of a method of producing a nonhuman host, exemplified as a mouse,
lacking expression of an endogenous immunoglobulin heavy chain is disclosed in U.S.
Patent No. 5,939,598. It can be obtained by a method including deleting the J segment 
genes from at least one endogenous heavy chain locus in an embryonic stem cell to prevent 
rearrangement of the locus and to prevent formation of a transcript of a rearranged 
immunoglobulin heavy chain locus, the deletion being effected by a targeting vector
containing a gene encoding a selectable marker; and producing from the embryonic stem
cell a transgenic mouse whose somatic and germ cells contain the gene encoding the 
selectable marker. 

A method for producing an antibody of interest, such as a human antibody, is 
disclosed in U.S. Patent No. 5,916,771. It includes introducing an expression vector that
contains a nucleotide sequence encoding a heavy chain into one mammalian host cell in
culture, introducing an expression vector containing a nucleotide sequence encoding a light 
chain into another mammalian host cell, and fusing the two cells to form a hybrid cell. The 
hybrid cell expresses an antibody containing the heavy chain and the light chain. 

In a further improvement on this procedure, a method for identifying a clinically
relevant epitope on an immunogen, and a correlative method for selecting an antibody that
binds immunospecifically to the relevant epitope with high affinity, are disclosed in PCT 
publication WO 99/53049. 

Fab Fragments and Single Chain Antibodies 

According to the invention, techniques can be adapted for the production of 
single-chain antibodies specific to an antigenic protein of the invention (see e.g., U.S.
Patent No. 4,946,778). In addition, methods can be adapted for the construction of Fab
expression libraries (see e.g., Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and
effective identification of monoclonal Fab fragments with the desired specificity for a
protein or derivatives, fragments, analogs or homologs thereof. Antibody fragments that
contain the idiotypes to a protein antigen may be produced by techniques known in the art
including, but not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an
antibody molecule; (ii) an Fab fragment generated by reducing the disulfide bridges of an
F(ab')2 fragment; (iii) an Fab fragment generated by the treatment of the antibody molecule
with papain and a reducing agent and (iv) Fv fragments.

Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies 
that have binding specificities for at least two different antigens. In the present case, one of 
the binding specificities is for an antigenic protein of the invention. The second binding
target is any other antigen, and advantageously is a cell-surface protein or receptor or 
receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the 
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random 
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) 
produce a potential mixture of ten different antibody molecules, of which only one has the 
correct bispecific structure. The purification of the correct molecule is usually
accomplished by affinity chromatography steps. Similar procedures are disclosed in WO
93/08829, published 13 May 1993, and in Traunecker et al., EMBO J., 10:3655-3659 
(1991). 
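
The figure of ten different antibody molecules can be checked by simple enumeration. The following Python sketch is offered only as an illustration under stated assumptions: two heavy chains and two light chains (labeled H1, H2, L1 and L2 here, which are hypothetical labels), any heavy chain able to pair with any light chain, and an antibody treated as an unordered pair of heavy/light half-molecules. It counts distinct species only and says nothing about their relative abundance.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ("H1", "H2")
light_chains = ("L1", "L2")

# Each half-molecule is one heavy chain paired with one light chain.
half_molecules = list(product(heavy_chains, light_chains))

# An IgG-like molecule is an unordered pair of half-molecules.
antibodies = list(combinations_with_replacement(half_molecules, 2))

# The desired bispecific molecule pairs each heavy chain with its cognate light chain.
correct = [ab for ab in antibodies if set(ab) == {("H1", "L1"), ("H2", "L2")}]

print(len(antibodies), len(correct))  # prints "10 1"
```

Four possible half-molecules give ten unordered pairs, of which exactly one carries both cognate heavy/light combinations.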

Antibody variable domains with the desired binding specificities (antibody-antigen 
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least
part of the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain
constant region (CH1) containing the site necessary for light-chain binding present in at
least one of the fusions. DNAs encoding the immunoglobulin heavy-chain fusions and, if 
desired, the immunoglobulin light chain, are inserted into separate expression vectors, and
are co-transfected into a suitable host organism. For further details of generating bispecific
antibodies see, for example, Suresh et al., Methods in Enzymology, 121:210 (1986). 

According to another approach described in WO 96/27011, the interface between a 
pair of antibody molecules can be engineered to maximize the percentage of heterodimers 
which are recovered from recombinant cell culture. The preferred interface comprises at
least a part of the CH3 region of an antibody constant domain. In this method, one or more 
small amino acid side chains from the interface of the first antibody molecule are replaced 
with larger side chains (e.g. tyrosine or tryptophan). Compensatory "cavities" of identical 
or similar size to the large side chain(s) are created on the interface of the second antibody
molecule by replacing large amino acid side chains with smaller ones (e.g. alanine or 
threonine). This provides a mechanism for increasing the yield of the heterodimer over 
other unwanted end-products such as homodimers. 

Bispecific antibodies can be prepared as full length antibodies or antibody
fragments (e.g. F(ab')2 bispecific antibodies). Techniques for generating bispecific
antibodies from antibody fragments have been described in the literature. For example,
bispecific antibodies can be prepared using chemical linkage. Brennan et al., Science 
229:81 (1985) describe a procedure wherein intact antibodies are proteolytically cleaved to 
generate F(ab')2 fragments. These fragments are reduced in the presence of the dithiol
complexing agent sodium arsenite to stabilize vicinal dithiols and prevent intermolecular
disulfide formation. The Fab' fragments generated are then converted to thionitrobenzoate
(TNB) derivatives. One of the Fab'-TNB derivatives is then reconverted to the Fab'-thiol
by reduction with mercaptoethylamine and is mixed with an equimolar amount of the other 
Fab'-TNB derivative to form the bispecific antibody. The bispecific antibodies produced
can be used as agents for the selective immobilization of enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) 
describe the production of a fully humanized bispecific antibody F(ab')2 molecule. Each 
Fab' fragment was separately secreted from E. coli and subjected to directed chemical
coupling in vitro to form the bispecific antibody. The bispecific antibody thus formed was
able to bind to cells overexpressing the ErbB2 receptor and normal human T cells, as well 
as trigger the lytic activity of human cytotoxic lymphocytes against human breast tumor 
targets. 

Various techniques for making and isolating bispecific antibody fragments directly 
from recombinant cell culture have also been described. For example, bispecific antibodies
have been produced using leucine zippers. Kostelny et al., J. Immunol., 148(5):1547-1553
(1992). The leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' 
portions of two different antibodies by gene fusion. The antibody homodimers were 
reduced at the hinge region to form monomers and then re-oxidized to form the antibody
heterodimers. This method can also be utilized for the production of antibody homodimers. 
The "diabody" technology described by Hollinger et al., Proc. Natl. Acad. Sci. USA 
90:6444-6448 (1993) has provided an alternative mechanism for making bispecific
antibody fragments. The fragments comprise a heavy-chain variable domain (VH)
connected to a light-chain variable domain (VL) by a linker which is too short to allow
pairing between the two domains on the same chain. Accordingly, the VH and VL domains
of one fragment are forced to pair with the complementary VL and VH domains of another
fragment, thereby forming two antigen-binding sites. Another strategy for making
bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been
reported. See, Gruber et al., J. Immunol. 152:5368 (1994). 

Antibodies with more than two valencies are contemplated. For example, 
trispecific antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991). 

Exemplary bispecific antibodies can bind to two different epitopes, at least one of
which originates in the protein antigen of the invention. Alternatively, an anti-antigenic
arm of an immunoglobulin molecule can be combined with an arm which binds to a
triggering molecule on a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3,
CD28, or B7), or Fc receptors for IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and
FcγRIII (CD16) so as to focus cellular defense mechanisms to the cell expressing the
particular antigen. Bispecific antibodies can also be used to direct cytotoxic agents to cells
which express a particular antigen. These antibodies possess an antigen-binding arm and
an arm which binds a cytotoxic agent or a radionuclide chelator, such as EOTUBE, DTPA,
DOTA, or TETA. Another bispecific antibody of interest binds the protein antigen 
described herein and further binds tissue factor (TF). 

Heteroconjugate Antibodies

Heteroconjugate antibodies are also within the scope of the present invention. 
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such 
antibodies have, for example, been proposed to target immune system cells to unwanted 
cells (U.S. Patent No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO
92/200373; EP 03089). It is contemplated that the antibodies can be prepared in vitro using
known methods in synthetic protein chemistry, including those involving crosslinking 
agents. For example, immunotoxins can be constructed using a disulfide exchange reaction 
or by forming a thioether bond. Examples of suitable reagents for this purpose include 
iminothiolate and methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S.
Patent No. 4,676,980. 

Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector 
function, so as to enhance, e.g., the effectiveness of the antibody in treating cancer. For
example, cysteine residue(s) can be introduced into the Fc region, thereby allowing
interchain disulfide bond formation in this region. The homodimeric antibody thus 
generated can have improved internalization capability and/or increased 
complement-mediated cell killing and antibody-dependent cellular cytotoxicity (ADCC).
See Caron et al., J. Exp Med., 176:1191-1195 (1992) and Shopes, J. Immunol., 148:
2918-2922 (1992). Homodimeric antibodies with enhanced anti-tumor activity can also be
prepared using heterobifunctional cross-linkers as described in Wolff et al. Cancer
Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that has
dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody 
conjugated to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an 
enzymatically active toxin of bacterial, fungal, plant, or animal origin, or fragments
thereof), or a radioactive isotope (i.e., a radioconjugate).

Chemotherapeutic agents useful in the generation of such immunoconjugates have 
been described above. Enzymatically active toxins and fragments thereof that can be used 
include diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A 
chain (from Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain,
alpha-sarcin, Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins
(PAPI, PAPII, and PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria
officinalis inhibitor, gelonin, mitogellin, restrictocin, phenomycin, enomycin, and the
trichothecenes. A variety of radionuclides are available for the production of
radioconjugated antibodies. Examples include 212Bi, 131I, 131In, 90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of
bifunctional protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol)
propionate (SPDP), iminothiolane (IT), bifunctional derivatives of imidoesters (such as
dimethyl adipimidate HCL), active esters (such as disuccinimidyl suberate), aldehydes
(such as glutaraldehyde), bis-azido compounds (such as bis (p-azidobenzoyl)
hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene
2,6-diisocyanate), and bis-active fluorine compounds (such as
1,5-difluoro-2,4-dinitrobenzene). For example, a ricin immunotoxin can be prepared as
described in Vitetta et al., Science, 238:1098 (1987). Carbon-14-labeled
1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid (MX-DTPA) is an
exemplary chelating agent for conjugation of radionuclide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate
is administered to the patient, followed by removal of unbound conjugate from the
circulation using a clearing agent and then administration of a "ligand" (e.g., avidin) that is
in turn conjugated to a cytotoxic agent.

Immunoliposomes 

The antibodies disclosed herein can also be formulated as immunoliposomes. 
Liposomes containing the antibody are prepared by methods known in the art, such as 
described in Epstein et al., Proc. Natl. Acad. Sci. USA, 82: 3688 (1985); Hwang et al.,
Proc. Natl. Acad. Sci. USA, 77: 4030 (1980); and U.S. Pat. Nos. 4,485,045 and 4,544,545.
Liposomes with enhanced circulation time are disclosed in U.S. Patent No. 5,013,556. 

Particularly useful liposomes can be generated by the reverse-phase evaporation 
method with a lipid composition comprising phosphatidylcholine, cholesterol, and 
PEG-derivatized phosphatidylethanolamine (PEG-PE). Liposomes are extruded through
filters of defined pore size to yield liposomes with the desired diameter. Fab' fragments of
the antibody of the present invention can be conjugated to the liposomes as described in
Martin et al., J. Biol. Chem., 257: 286-288 (1982) via a disulfide-interchange reaction. A
chemotherapeutic agent (such as Doxorubicin) is optionally contained within the liposome.
See Gabizon et al., J. National Cancer Inst., 81(19): 1484 (1989).

Diagnostic Applications of Antibodies Directed Against the Proteins of the Invention

In one embodiment, methods for the screening of antibodies that possess the desired 
specificity include, but are not limited to, enzyme linked immunosorbent assay (ELISA) 
and other immunologically mediated techniques known within the art. In a specific
embodiment, selection of antibodies that are specific to a particular domain of an NOVX
protein is facilitated by generation of hybridomas that bind to the fragment of an NOVX
protein possessing such a domain. Thus, antibodies that are specific for a desired domain
within an NOVX protein, or derivatives, fragments, analogs or homologs thereof, are also
provided herein. 

Antibodies directed against a NOVX protein of the invention may be used in 
methods known within the art relating to the localization and/or quantitation of a NOVX 
protein (e.g., for use in measuring levels of the NOVX protein within appropriate 
physiological samples, for use in diagnostic methods, for use in imaging the protein, and
the like). In a given embodiment, antibodies specific to a NOVX protein, or derivative,
fragment, analog or homolog thereof, that contain the antibody derived antigen binding
domain, are utilized as pharmacologically active compounds (referred to hereinafter as
"Therapeutics"). 

An antibody specific for a NOVX protein of the invention (e.g., a monoclonal
antibody or a polyclonal antibody) can be used to isolate a NOVX polypeptide by standard
techniques, such as immunoaffinity chromatography or immunoprecipitation. An antibody
to a NOVX polypeptide can facilitate the purification of a natural NOVX antigen from
cells, or of a recombinantly produced NOVX antigen expressed in host cells. Moreover,
such an anti-NOVX antibody can be used to detect the antigenic NOVX protein (e.g., in a
cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of
expression of the antigenic NOVX protein. Antibodies directed against a NOVX protein
can be used diagnostically to monitor protein levels in tissue as part of a clinical testing
procedure, e.g., to determine the efficacy of a given treatment regimen.

Detection can be facilitated by coupling (i.e., physically linking) the antibody to a
detectable substance. Examples of detectable substances include various enzymes, 
prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, 
and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, 
alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable
prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of
suitable fluorescent materials include umbelliferone, fluorescein, fluorescein
isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or
phycoerythrin; an example of a luminescent material includes luminol; examples of
bioluminescent materials include luciferase, luciferin, and aequorin, and examples of
suitable radioactive material include 125I, 131I, 35S or 3H.

Antibody Therapeutics 

Antibodies of the invention, including polyclonal, monoclonal, humanized and fully 
human antibodies, may be used as therapeutic agents. Such agents will generally be employed
to treat or prevent a disease or pathology in a subject. An antibody preparation, preferably 
one having high specificity and high affinity for its target antigen, is administered to the 
subject and will generally have an effect due to its binding with the target. Such an effect 
may be one of two kinds, depending on the specific nature of the interaction between the
given antibody molecule and the target antigen in question. In the first instance,
administration of the antibody may abrogate or inhibit the binding of the target with an
endogenous ligand to which it naturally binds. In this case, the antibody binds to the target 
and masks a binding site of the naturally occurring ligand, wherein the ligand serves as an 
effector molecule. Thus the receptor mediates a signal transduction pathway for which
ligand is responsible.

Alternatively, the effect may be one in which the antibody elicits a physiological 
result by virtue of binding to an effector binding site on the target molecule. In this case 
the target, a receptor having an endogenous ligand which may be absent or defective in the 
disease or pathology, binds the antibody as a surrogate effector ligand, initiating a
receptor-based signal transduction event by the receptor.

A therapeutically effective amount of an antibody of the invention relates generally 
to the amount needed to achieve a therapeutic objective. As noted above, this may be a 
binding interaction between the antibody and its target antigen that, in certain cases, 
interferes with the functioning of the target, and in other cases, promotes a physiological
response. The amount required to be administered will furthermore depend on the binding
affinity of the antibody for its specific antigen, and will also depend on the rate at which an
administered antibody is depleted from the free volume of the subject to which it is
administered. Common ranges for therapeutically effective dosing of an antibody or 
antibody fragment of the invention may be, by way of nonlimiting example, from about 0.1
mg/kg body weight to about 50 mg/kg body weight. Common dosing frequencies may
range, for example, from twice daily to once a week. 
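
The weight-based range above can be turned into a per-administration amount by simple multiplication. The following is a minimal sketch only: the function name and the 70 kg body weight are illustrative assumptions, and the bounds are the nonlimiting example values of about 0.1 and about 50 mg/kg stated above.

```python
def dose_range_mg(body_weight_kg, low_mg_per_kg=0.1, high_mg_per_kg=50.0):
    """Return (low, high) total antibody amount in mg for one administration.

    The default bounds are the nonlimiting example range from the text;
    the function itself is a hypothetical helper, not part of the disclosure.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return (low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg)


# Assumed example subject of 70 kg (illustrative value only).
low, high = dose_range_mg(70.0)
print(f"{low:.1f} mg to {high:.1f} mg per administration")
```

The actual amount for a given antibody would still depend on its binding affinity and clearance rate, as noted above.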

Pharmaceutical Compositions of Antibodies

Antibodies specifically binding a protein of the invention, as well as other 
molecules identified by the screening assays disclosed herein, can be administered for the 
treatment of various disorders in the form of pharmaceutical compositions. Principles and 
considerations involved in preparing such compositions, as well as guidance in the choice
of components, are provided, for example, in Remington: The Science And Practice Of
Pharmacy, 19th ed. (Alfonso R. Gennaro, et al., editors), Mack Pub. Co., Easton, Pa., 1995;
Drug Absorption Enhancement: Concepts, Possibilities, Limitations, And Trends,
Harwood Academic Publishers, Langhorne, Pa., 1994; and Peptide And Protein Drug
Delivery (Advances In Parenteral Sciences, Vol. 4), 1991, M. Dekker, New York.

If the antigenic protein is intracellular and whole antibodies are used as inhibitors, 
internalizing antibodies are preferred. However, liposomes can also be used to deliver the 
antibody, or an antibody fragment, into cells. Where antibody fragments are used, the 
smallest inhibitory fragment that specifically binds to the binding domain of the target
protein is preferred. For example, based upon the variable-region sequences of an
antibody, peptide molecules can be designed that retain the ability to bind the target protein 
sequence. Such peptides can be synthesized chemically and/or produced by recombinant 
DNA technology. See, e.g., Marasco et al., Proc. Natl. Acad. Sci. USA, 90: 7889-7893 
(1993). The formulation herein can also contain more than one active compound as
necessary for the particular indication being treated, preferably those with complementary
activities that do not adversely affect each other. Alternatively, or in addition, the 
composition can comprise an agent that enhances its function, such as, for example, a 
cytotoxic agent, cytokine, chemotherapeutic agent, or growth-inhibitory agent. Such 
molecules are suitably present in combination in amounts that are effective for the purpose
intended.

The active ingredients can also be entrapped in microcapsules prepared, for 
example, by coacervation techniques or by interfacial polymerization, for example,
hydroxymethylcellulose or gelatin-microcapsules and poly-(methylmethacrylate)
microcapsules, respectively, in colloidal drug delivery systems (for example, liposomes,
albumin microspheres, microemulsions, nano-particles, and nanocapsules) or in
macroemulsions. 

The formulations to be used for in vivo administration must be sterile. This is 
readily accomplished by filtration through sterile filtration membranes.

Sustained-release preparations can be prepared. Suitable examples of
sustained-release preparations include semipermeable matrices of solid hydrophobic
polymers containing the antibody, which matrices are in the form of shaped articles, e.g.,
films, or microcapsules. Examples of sustained-release matrices include polyesters,
hydrogels (for example, poly(2-hydroxyethyl-methacrylate), or poly(vinylalcohol)),
polylactides (U.S. Pat. No. 3,773,919), copolymers of L-glutamic acid and γ
ethyl-L-glutamate, non-degradable ethylene-vinyl acetate, degradable lactic acid-glycolic
acid copolymers such as the LUPRON DEPOT™ (injectable microspheres composed of
lactic acid-glycolic acid copolymer and leuprolide acetate), and
poly-D-(-)-3-hydroxybutyric acid. While polymers such as ethylene-vinyl acetate and
lactic acid-glycolic acid enable release of molecules for over 100 days, certain hydrogels 
release proteins for shorter time periods. 

ELISA Assay 

An agent for detecting an analyte protein is an antibody capable of binding to an
analyte protein, preferably an antibody with a detectable label. Antibodies can be
polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof
(e.g., Fab or F(ab')2) can be used. The term "labeled", with regard to the probe or antibody, is
intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically
linking) a detectable substance to the probe or antibody, as well as indirect labeling of the
probe or antibody by reactivity with another reagent that is directly labeled. Examples of
indirect labeling include detection of a primary antibody using a fluorescently-labeled 
secondary antibody and end-labeling of a DNA probe with biotin such that it can be 
detected with fluorescently-labeled streptavidin. The term "biological sample" is intended 
to include tissues, cells and biological fluids isolated from a subject, as well as tissues, cells
and fluids present within a subject. Included within the usage of the term "biological
sample", therefore, is blood and a fraction or component of blood including blood serum,
blood plasma, or lymph. That is, the detection method of the invention can be used to 
detect an analyte mRNA, protein, or genomic DNA in a biological sample in vitro as well 
as in vivo. For example, in vitro techniques for detection of an analyte mRNA include
Northern hybridizations and in situ hybridizations. In vitro techniques for detection of an
analyte protein include enzyme linked immunosorbent assays (ELISAs), Western blots,
immunoprecipitations, and immunofluorescence. In vitro techniques for detection of an
analyte genomic DNA include Southern hybridizations. Procedures for conducting
immunoassays are described, for example in "ELISA: Theory and Practice: Methods in
Molecular Biology", Vol. 42, J. R. Crowther (Ed.) Humana Press, Totowa, NJ, 1995;
"Immunoassay", E. Diamandis and T. Christopoulos, Academic Press, Inc., San Diego,
CA, 1996; and "Practice and Theory of Enzyme Immunoassays", P. Tijssen, Elsevier
Science Publishers, Amsterdam, 1985. Furthermore, in vivo techniques for detection of an
analyte protein include introducing into a subject a labeled anti-analyte protein antibody.
For example, the antibody can be labeled with a radioactive marker whose presence and 
location in a subject can be detected by standard imaging techniques. 

NOVX Recombinant Expression Vectors and Host Cells 

Another aspect of the invention pertains to vectors, preferably expression vectors,
containing a nucleic acid encoding a NOVX protein, or derivatives, fragments, analogs or 
homologs thereof. As used herein, the term "vector" refers to a nucleic acid molecule 
capable of transporting another nucleic acid to which it has been linked. One type of vector 
is a "plasmid", which refers to a circular double stranded DNA loop into which additional
DNA segments can be ligated. Another type of vector is a viral vector, wherein additional
DNA segments can be ligated into the viral genome. Certain vectors are capable of 
autonomous replication in a host cell into which they are introduced (e.g., bacterial vectors 
having a bacterial origin of replication and episomal mammalian vectors). Other vectors 
(e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon
introduction into the host cell, and thereby are replicated along with the host genome.
Moreover, certain vectors are capable of directing the expression of genes to which they are 
operatively-linked. Such vectors are referred to herein as "expression vectors". In general, 
expression vectors of utility in recombinant DNA techniques are often in the form of 
plasmids. In the present specification, "plasmid" and "vector" can be used interchangeably
as the plasmid is the most commonly used form of vector. However, the invention is
intended to include such other forms of expression vectors, such as viral vectors (e.g., 
replication defective retroviruses, adenoviruses and adeno-associated viruses), which serve 
equivalent functions. 

The recombinant expression vectors of the invention comprise a nucleic acid of the 
invention in a form suitable for expression of the nucleic acid in a host cell, which means
that the recombinant expression vectors include one or more regulatory sequences, selected 
on the basis of the host cells to be used for expression, that is operatively-linked to the 
nucleic acid sequence to be expressed. Within a recombinant expression vector,
"operably-linked" is intended to mean that the nucleotide sequence of interest is linked to
the regulatory sequence(s) in a manner that allows for expression of the nucleotide 
sequence (e.g., in an in vitro transcription/translation system or in a host cell when the 
vector is introduced into the host cell). 

The term "regulatory sequence" is intended to include promoters, enhancers and
other expression control elements (e.g., polyadenylation signals). Such regulatory 
sequences are described, for example, in Goeddel, Gene Expression Technology: 
Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990). Regulatory 
sequences include those that direct constitutive expression of a nucleotide sequence in
many types of host cell and those that direct expression of the nucleotide sequence only in
certain host cells (e.g., tissue-specific regulatory sequences). It will be appreciated by 
those skilled in the art that the design of the expression vector can depend on such factors 
as the choice of the host cell to be transformed, the level of expression of protein desired, 
etc. The expression vectors of the invention can be introduced into host cells to thereby
produce proteins or peptides, including fusion proteins or peptides, encoded by nucleic
acids as described herein (e.g., NOVX proteins, mutant forms of NOVX proteins, fusion 
proteins, etc.). 

The recombinant expression vectors of the invention can be designed for expression 
of NOVX proteins in prokaryotic or eukaryotic cells. For example, NOVX proteins can be
expressed in bacterial cells such as Escherichia coli, insect cells (using baculovirus
expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed further
in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic
Press, San Diego, Calif. (1990). Alternatively, the recombinant expression vector can be
transcribed and translated in vitro, for example using T7 promoter regulatory sequences
and T7 polymerase.

Expression of proteins in prokaryotes is most often carried out in Escherichia coli
with vectors containing constitutive or inducible promoters directing the expression of 
either fusion or non-fusion proteins. Fusion vectors add a number of amino acids to a 
protein encoded therein, usually to the amino terminus of the recombinant protein. Such
fusion vectors typically serve three purposes: (i) to increase expression of recombinant
protein; (ii) to increase the solubility of the recombinant protein; and (iii) to aid in the
purification of the recombinant protein by acting as a ligand in affinity purification. Often,
in fusion expression vectors, a proteolytic cleavage site is introduced at the junction of the
fusion moiety and the recombinant protein to enable separation of the recombinant protein
from the fusion moiety subsequent to purification of the fusion protein. Such enzymes, and
their cognate recognition sequences, include Factor Xa, thrombin and enterokinase.
Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith and
Johnson, 1988. Gene 67: 31-40), pMAL (New England Biolabs, Beverly, Mass.) and
pRIT5 (Pharmacia, Piscataway, N.J.) that fuse glutathione S-transferase (GST), maltose E
binding protein, or protein A, respectively, to the target recombinant protein.

Examples of suitable inducible non-fusion E. coli expression vectors include pTrc 
(Amann et al., (1988) Gene 69:301-315) and pET 11d (Studier et al., Gene Expression
Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990)
60-89). 

One strategy to maximize recombinant protein expression in E. coli is to express the 
protein in a host bacteria with an impaired capacity to proteolytically cleave the 
recombinant protein. See, e.g., Gottesman, Gene Expression Technology: Methods in
Enzymology 185, Academic Press, San Diego, Calif. (1990) 119-128. Another strategy is
to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector
so that the individual codons for each amino acid are those preferentially utilized in E. coli
(see, e.g., Wada, et al., 1992. Nucl. Acids Res. 20: 2111-2118). Such alteration of nucleic
acid sequences of the invention can be carried out by standard DNA synthesis techniques. 
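
The codon-replacement strategy described above can be sketched as a simple lookup, in which each amino acid of the encoded protein is written with a single host-preferred codon. The sketch below is illustrative only: the function name, the example peptide, and the partial preference table are assumptions for demonstration and are not taken from this specification; an actual design would rely on a measured E. coli codon-usage table such as that of Wada, et al.

```python
# Illustrative, partial table mapping one-letter amino acid codes to a single
# codon assumed here to be favored in E. coli. The entries are assumptions for
# this sketch; consult a measured codon-usage table for real designs.
PREFERRED_CODON = {
    "M": "ATG",
    "A": "GCG",
    "L": "CTG",
    "K": "AAA",
    "E": "GAA",
    "G": "GGC",
    "*": "TAA",  # stop
}


def back_translate(protein):
    """Write a protein sequence as DNA using one preferred codon per residue."""
    codons = []
    for residue in protein.upper():
        if residue not in PREFERRED_CODON:
            raise KeyError(f"no preferred codon listed for residue {residue!r}")
        codons.append(PREFERRED_CODON[residue])
    return "".join(codons)


# Hypothetical seven-position peptide (including the stop) used only as an example.
coding_sequence = back_translate("MKLAEG*")
print(coding_sequence)
```

The resulting sequence encodes the same amino acids as the original nucleic acid; only the synonymous codon choice differs.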

In another embodiment, the NOVX expression vector is a yeast expression vector.
Examples of vectors for expression in yeast Saccharomyces cerevisiae include pYepSec1
(Baldari, et al., 1987. EMBO J. 6: 229-234), pMFa (Kurjan and Herskowitz, 1982. Cell 30:
933-943), pJRY88 (Schultz et al., 1987. Gene 54: 113-123), pYES2 (Invitrogen
Corporation, San Diego, Calif.), and picZ (InVitrogen Corp, San Diego, Calif.).

Alternatively, NOVX can be expressed in insect cells using baculovirus expression
vectors. Baculovirus vectors available for expression of proteins in cultured insect cells
(e.g., SF9 cells) include the pAc series (Smith, et al., 1983. Mol. Cell. Biol. 3: 2156-2165)
and the pVL series (Luckow and Summers, 1989. Virology 170: 31-39).

In yet another embodiment, a nucleic acid of the invention is expressed in
mammalian cells using a mammalian expression vector. Examples of mammalian
expression vectors include pCDM8 (Seed, 1987. Nature 329: 840) and pMT2PC (Kaufman,
et al., 1987. EMBO J. 6: 187-195). When used in mammalian cells, the expression vector's
control functions are often provided by viral regulatory elements. For example, commonly
used promoters are derived from polyoma, adenovirus 2, cytomegalovirus, and simian virus
40. For other suitable expression systems for both prokaryotic and eukaryotic cells see,
e.g., Chapters 16 and 17 of Sambrook, et al., Molecular Cloning: A Laboratory
Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, N.Y., 1989.

In another embodiment, the recombinant mammalian expression vector is capable 
of directing expression of the nucleic acid preferentially in a particular cell type (e.g., 
tissue-specific regulatory elements are used to express the nucleic acid). Tissue-specific 
regulatory elements are known in the art. Non-limiting examples of suitable tissue-specific
promoters include the albumin promoter (liver-specific; Pinkert, et al., 1987. Genes Dev. 1:
268-277), lymphoid-specific promoters (Calame and Eaton, 1988. Adv. Immunol. 43:
235-275), in particular promoters of T cell receptors (Winoto and Baltimore, 1989. EMBO
J. 8: 729-733) and immunoglobulins (Banerji, et al., 1983. Cell 33: 729-740; Queen and
Baltimore, 1983. Cell 33: 741-748), neuron-specific promoters (e.g., the neurofilament
promoter; Byrne and Ruddle, 1989. Proc. Natl. Acad. Sci. USA 86: 5473-5477),
pancreas-specific promoters (Edlund, et al., 1985. Science 230: 912-916), and mammary
gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European
Application Publication No. 264,166). Developmentally-regulated promoters are also
encompassed, e.g., the murine hox promoters (Kessel and Gruss, 1990. Science 249:
374-379) and the α-fetoprotein promoter (Camper and Tilghman, 1989. Genes Dev. 3:
537-546).

The invention further provides a recombinant expression vector comprising a DNA 
molecule of the invention cloned into the expression vector in an antisense orientation. 
That is, the DNA molecule is operatively-linked to a regulatory sequence in a manner that
allows for expression (by transcription of the DNA molecule) of an RNA molecule that is
antisense to NOVX mRNA. Regulatory sequences operatively linked to a nucleic acid 
cloned in the antisense orientation can be chosen that direct the continuous expression of 
the antisense RNA molecule in a variety of cell types, for instance viral promoters and/or 
enhancers, or regulatory sequences can be chosen that direct constitutive, tissue specific or
cell type specific expression of antisense RNA. The antisense expression vector can be in
the form of a recombinant plasmid, phagemid or attenuated virus in which antisense nucleic
acids are produced under the control of a high efficiency regulatory region, the activity of
which can be determined by the cell type into which the vector is introduced. For a
discussion of the regulation of gene expression using antisense genes see, e.g., Weintraub,
et al., "Antisense RNA as a molecular tool for genetic analysis," Reviews-Trends in
Genetics, Vol. 1(1) 1986.
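
An RNA that is antisense to an mRNA is its reverse complement, which is what transcription of an insert cloned in the antisense orientation yields. As a minimal sketch, assuming the mRNA is given 5' to 3' as a string of A, C, G and U (the function name and the six-base example are hypothetical and are not a NOVX sequence):

```python
# Translation table for RNA base pairing: A<->U and C<->G.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna):
    """Return the antisense RNA (reverse complement), written 5' to 3'."""
    mrna = mrna.upper()
    unexpected = set(mrna) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters in RNA sequence: {sorted(unexpected)}")
    # Complement every base, then reverse so the result also reads 5' to 3'.
    return mrna.translate(_COMPLEMENT)[::-1]


# Hypothetical six-base fragment used only for illustration.
print(antisense_rna("AUGGCC"))
```

Applying the function twice returns the original sequence, which is a quick sanity check on the base-pairing table.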

Another aspect of the invention pertains to host cells into which a recombinant
expression vector of the invention has been introduced. The terms "host cell" and
"recombinant host cell" are used interchangeably herein. It is understood that such terms
refer not only to the particular subject cell but also to the progeny or potential progeny of
such a cell. Because certain modifications may occur in succeeding generations due to
either mutation or environmental influences, such progeny may not, in fact, be identical to
the parent cell, but are still included within the scope of the term as used herein.

A host cell can be any prokaryotic or eukaryotic cell. For example, NOVX protein 
can be expressed in bacterial cells such as E. coli, insect cells, yeast or mammalian cells 
(such as Chinese hamster ovary cells (CHO) or COS cells). Other suitable host cells are 
known to those skilled in the art. 

Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional
transformation or transfection techniques. As used herein, the terms "transformation" and 
"transfection" are intended to refer to a variety of art-recognized techniques for introducing 
foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium 
chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or
electroporation. Suitable methods for transforming or transfecting host cells can be found
in Sambrook, et al. (Molecular Cloning: A Laboratory Manual. 2nd ed., Cold 
Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 
N.Y., 1989), and other laboratory manuals. 

For stable transfection of mammalian cells, it is known that, depending upon the
expression vector and transfection technique used, only a small fraction of cells may
integrate the foreign DNA into their genome. In order to identify and select these 
integrants, a gene that encodes a selectable marker (e.g., resistance to antibiotics) is 
generally introduced into the host cells along with the gene of interest. Various selectable
markers include those that confer resistance to drugs, such as G418, hygromycin and
methotrexate. Nucleic acid encoding a selectable marker can be introduced into a host cell
on the same vector as that encoding NOVX or can be introduced on a separate vector.
Cells stably transfected with the introduced nucleic acid can be identified by drug selection
(e.g., cells that have incorporated the selectable marker gene will survive, while the other
cells die).

A host cell of the invention, such as a prokaryotic or eukaryotic host cell in culture, 
can be used to produce (i.e., express) NOVX protein. Accordingly, the invention further
provides methods for producing NOVX protein using the host cells of the invention. In one
embodiment, the method comprises culturing the host cell of the invention (into which a
recombinant expression vector encoding NOVX protein has been introduced) in a suitable 
medium such that NOVX protein is produced. In another embodiment, the method further 
comprises isolating NOVX protein from the medium or the host cell. 

Transgenic NOVX Animals

The host cells of the invention can also be used to produce non-human transgenic 
animals. For example, in one embodiment, a host cell of the invention is a fertilized oocyte 
or an embryonic stem cell into which NOVX protein-coding sequences have been 
introduced. Such host cells can then be used to create non-human transgenic animals in
which exogenous NOVX sequences have been introduced into their genome or
homologous recombinant animals in which endogenous NOVX sequences have been
altered. Such animals are useful for studying the function and/or activity of NOVX protein 
and for identifying and/or evaluating modulators of NOVX protein activity. As used 
herein, a "transgenic animal" is a non-human animal, preferably a mammal, more
preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal
includes a transgene. Other examples of transgenic animals include non-human primates, 
sheep, dogs, cows, goats, chickens, amphibians, etc. A transgene is exogenous DNA that is 
integrated into the genome of a cell from which a transgenic animal develops and that 
remains in the genome of the mature animal, thereby directing the expression of an
encoded gene product in one or more cell types or tissues of the transgenic animal. As used
herein, a "homologous recombinant animal" is a non-human animal, preferably a mammal, 
more preferably a mouse, in which an endogenous NOVX gene has been altered by 
homologous recombination between the endogenous gene and an exogenous DNA 
molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to
development of the animal.

A transgenic animal of the invention can be created by introducing 
NOVX-encoding nucleic acid into the male pronuclei of a fertilized oocyte (e.g., by 
microinjection, retroviral infection) and allowing the oocyte to develop in a pseudopregnant
female foster animal. The human NOVX cDNA sequences, i.e., any one of SEQ ID
NO:2n-1, wherein n is an integer between 1 and 124, can be introduced as a transgene into
the genome of a non-human animal. Alternatively, a non-human homologue of the human
NOVX gene, such as a mouse NOVX gene, can be isolated based on hybridization to the
human NOVX cDNA (described further supra) and used as a transgene. Intronic
sequences and polyadenylation signals can also be included in the transgene to increase the 
efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be 
operably-linked to the NOVX transgene to direct expression of NOVX protein to particular 
cells. Methods for generating transgenic animals via embryo manipulation and
microinjection, particularly animals such as mice, have become conventional in the art and
are described, for example, in U.S. Patent Nos. 4,736,866; 4,870,009; and 4,873,191; and
Hogan, 1986. In: Manipulating the Mouse Embryo, Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, N.Y. Similar methods are used for production of other
transgenic animals. A transgenic founder animal can be identified based upon the presence
of the NOVX transgene in its genome and/or expression of NOVX mRNA in tissues or
cells of the animals. A transgenic founder animal can then be used to breed additional 
animals carrying the transgene. Moreover, transgenic animals carrying a 
transgene-encoding NOVX protein can further be bred to other transgenic animals carrying 
other transgenes. 
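
The expression "SEQ ID NO:2n-1, wherein n is an integer between 1 and 124" designates the odd-numbered sequence identifiers. The following minimal sketch simply enumerates them, assuming the range is inclusive at both ends; the function name is hypothetical.

```python
def nucleic_acid_seq_ids(n_max=124):
    """List the identifiers 2n-1 for n = 1..n_max (bounds assumed inclusive)."""
    return [2 * n - 1 for n in range(1, n_max + 1)]


ids = nucleic_acid_seq_ids()
print(ids[0], ids[-1], len(ids))
```

Under that assumption the identifiers run over the odd numbers from SEQ ID NO:1 to SEQ ID NO:247.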

To create a homologous recombinant animal, a vector is prepared which contains at
least a portion of a NOVX gene into which a deletion, addition or substitution has been
introduced to thereby alter, e.g., functionally disrupt, the NOVX gene. The NOVX gene
can be a human gene (e.g., the cDNA of any one of SEQ ID NO:2n-1, wherein n is an
integer between 1 and 124), but more preferably, is a non-human homologue of a human
NOVX gene. For example, a mouse homologue of human NOVX gene of SEQ ID
NO:2n-1, wherein n is an integer between 1 and 124, can be used to construct a
homologous recombination vector suitable for altering an endogenous NOVX gene in the
mouse genome. In one embodiment, the vector is designed such that, upon homologous
recombination, the endogenous NOVX gene is functionally disrupted (i.e., no longer
encodes a functional protein; also referred to as a "knock out" vector).

Alternatively, the vector can be designed such that, upon homologous 
recombination, the endogenous NOVX gene is mutated or otherwise altered but still
encodes functional protein (e.g., the upstream regulatory region can be altered to thereby
alter the expression of the endogenous NOVX protein). In the homologous recombination
vector, the altered portion of the NOVX gene is flanked at its 5'- and 3'-termini by
additional nucleic acid of the NOVX gene to allow for homologous recombination to occur
between the exogenous NOVX gene carried by the vector and an endogenous NOVX gene
in an embryonic stem cell. The additional flanking NOVX nucleic acid is of sufficient
length for successful homologous recombination with the endogenous gene. Typically,
several kilobases of flanking DNA (both at the 5'- and 3'-termini) are included in the
vector. See, e.g., Thomas, et al., 1987. Cell 51: 503 for a description of homologous
recombination vectors. The vector is then introduced into an embryonic stem cell line (e.g.,
by electroporation) and cells in which the introduced NOVX gene has
homologously-recombined with the endogenous NOVX gene are selected. See, e.g., Li, et
al., 1992. Cell 69: 915.

The selected cells are then injected into a blastocyst of an animal (e.g., a mouse) to 
form aggregation chimeras. See, e.g., Bradley, 1987. In: Teratocarcinomas and
Embryonic Stem Cells: A Practical Approach, Robertson, ed. IRL, Oxford, pp.
113-152. A chimeric embryo can then be implanted into a suitable pseudopregnant female
foster animal and the embryo brought to term. Progeny harboring the
homologously-recombined DNA in their germ cells can be used to breed animals in which
all cells of the animal contain the homologously-recombined DNA by germline
transmission of the transgene. Methods for constructing homologous recombination
vectors and homologous recombinant animals are described further in Bradley, 1991. Curr.
Opin. Biotechnol. 2: 823-829; PCT International Publication Nos.: WO 90/11354; WO
91/01140; WO 92/0968; and WO 93/04169.

In another embodiment, transgenic non-human animals can be produced that
contain selected systems that allow for regulated expression of the transgene. One example
of such a system is the cre/loxP recombinase system of bacteriophage P1. For a description
of the cre/loxP recombinase system, see, e.g., Lakso, et al., 1992. Proc. Natl. Acad. Sci.
USA 89: 6232-6236. Another example of a recombinase system is the FLP recombinase
system of Saccharomyces cerevisiae. See, O'Gorman, et al., 1991. Science 251:1351-1355.
If a cre/loxP recombinase system is used to regulate expression of the transgene, animals
containing transgenes encoding both the Cre recombinase and a selected protein are
required. Such animals can be provided through the construction of "double" transgenic
animals, e.g., by mating two transgenic animals, one containing a transgene encoding a
selected protein and the other containing a transgene encoding a recombinase.
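
The expected yield of "double" transgenic offspring from such a mating can be estimated by enumerating gametes. The sketch below rests on assumptions that are not stated in this specification: each parent is hemizygous for its transgene, each transgene sits at a single insertion site, and the two sites segregate independently. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def double_transgenic_fraction():
    """Expected fraction of offspring inheriting both transgenes.

    Assumes (hypothetically) one hemizygous parent per transgene, so each
    gamete carries that parent's transgene with probability 1/2.
    """
    protein_parent_gametes = [True, False]      # carries selected-protein transgene?
    recombinase_parent_gametes = [True, False]  # carries recombinase transgene?
    offspring = list(product(protein_parent_gametes, recombinase_parent_gametes))
    doubles = sum(1 for has_protein, has_recombinase in offspring
                  if has_protein and has_recombinase)
    return Fraction(doubles, len(offspring))


print(double_transgenic_fraction())
```

Under these assumptions about one offspring in four would carry both transgenes; homozygous parents or linked insertions would change the figure.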

Clones of the non-human transgenic animals described herein can also be produced 
according to the methods described in Wilmut, et al., 1997. Nature 385: 810-813. In brief,
a cell (e.g., a somatic cell) from the transgenic animal can be isolated and induced to exit
the growth cycle and enter G0 phase. The quiescent cell can then be fused, e.g., through the
use of electrical pulses, to an enucleated oocyte from an animal of the same species from
which the quiescent cell is isolated. The reconstructed oocyte is then cultured such that it
develops to morula or blastocyst and then transferred to a pseudopregnant female foster
animal. The offspring borne of this female foster animal will be a clone of the animal from
which the cell (e.g., the somatic cell) is isolated. 

Pharmaceutical Compositions 

The NOVX nucleic acid molecules, NOVX proteins, and anti-NOVX antibodies 
(also referred to herein as "active compounds") of the invention, and derivatives, fragments,
analogs and homologs thereof, can be incorporated into pharmaceutical compositions
suitable for administration. Such compositions typically comprise the nucleic acid
molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein,
"pharmaceutically acceptable carrier" is intended to include any and all solvents, dispersion
media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying
agents, and the like, compatible with pharmaceutical administration. Suitable carriers are
described in the most recent edition of Remington's Pharmaceutical Sciences, a standard
reference text in the field, which is incorporated herein by reference. Preferred examples of
such carriers or diluents include, but are not limited to, water, saline, Ringer's solutions,
dextrose solution, and 5% human serum albumin. Liposomes and non-aqueous vehicles
such as fixed oils may also be used. The use of such media and agents for
pharmaceutically active substances is well known in the art. Except insofar as any
conventional media or agent is incompatible with the active compound, use thereof in the 
compositions is contemplated. Supplementary active compounds can also be incorporated 
into the compositions. 

A pharmaceutical composition of the invention is formulated to be compatible with
its intended route of administration. Examples of routes of administration include
parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal
(i.e., topical), transmucosal, and rectal administration. Solutions or suspensions used for
parenteral, intradermal, or subcutaneous application can include the following components:
a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols,
glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl
alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite;
chelating agents such as ethylenediaminetetraacetic acid (EDTA); buffers such as acetates,
citrates or phosphates, and agents for the adjustment of tonicity such as sodium chloride or 
dextrose. The pH can be adjusted with acids or bases, such as hydrochloric acid or sodium 
hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or 
multiple dose vials made of glass or plastic. 

Pharmaceutical compositions suitable for injectable use include sterile aqueous
solutions (where water soluble) or dispersions and sterile powders for the extemporaneous
preparation of sterile injectable solutions or dispersion. For intravenous administration,
suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF,
Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be
sterile and should be fluid to the extent that easy syringeability exists. It must be stable
under the conditions of manufacture and storage and must be preserved against the
contaminating action of microorganisms such as bacteria and fungi. The carrier can be a
solvent or dispersion medium containing, for example, water, ethanol, polyol (for example,
glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable
mixtures thereof. The proper fluidity can be maintained, for example, by the use of a
coating such as lecithin, by the maintenance of the required particle size in the case of
dispersion and by the use of surfactants. Prevention of the action of microorganisms can be
achieved by various antibacterial and antifungal agents, for example, parabens,
chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be
preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol,
sorbitol, sodium chloride in the composition. Prolonged absorption of the injectable
compositions can be brought about by including in the composition an agent which delays
absorption, for example, aluminum monostearate and gelatin.

Sterile injectable solutions can be prepared by incorporating the active compound
(e.g., a NOVX protein or anti-NOVX antibody) in the required amount in an appropriate
solvent with one or a combination of ingredients enumerated above, as required, followed
by filtered sterilization. Generally, dispersions are prepared by incorporating the active
compound into a sterile vehicle that contains a basic dispersion medium and the required
other ingredients from those enumerated above. In the case of sterile powders for the
preparation of sterile injectable solutions, methods of preparation are vacuum drying and
freeze-drying that yields a powder of the active ingredient plus any additional desired
ingredient from a previously sterile-filtered solution thereof.

Oral compositions generally include an inert diluent or an edible carrier. They can
be enclosed in gelatin capsules or compressed into tablets. For the purpose of oral
therapeutic administration, the active compound can be incorporated with excipients and
used in the form of tablets, troches, or capsules. Oral compositions can also be prepared
using a fluid carrier for use as a mouthwash, wherein the compound in the fluid carrier is
applied orally and swished and expectorated or swallowed. Pharmaceutically compatible
binding agents, and/or adjuvant materials can be included as part of the composition. The
tablets, pills, capsules, troches and the like can contain any of the following ingredients, or
compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth
or gelatin; an excipient such as starch or lactose; a disintegrating agent such as alginic acid,
Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such
as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring
agent such as peppermint, methyl salicylate, or orange flavoring.

For administration by inhalation, the compounds are delivered in the form of an
aerosol spray from a pressurized container or dispenser which contains a suitable propellant,
e.g., a gas such as carbon dioxide, or a nebulizer.

Systemic administration can also be by transmucosal or transdermal means. For
transmucosal or transdermal administration, penetrants appropriate to the barrier to be
permeated are used in the formulation. Such penetrants are generally known in the art, and
include, for example, for transmucosal administration, detergents, bile salts, and fusidic
acid derivatives. Transmucosal administration can be accomplished through the use of
nasal sprays or suppositories. For transdermal administration, the active compounds are
formulated into ointments, salves, gels, or creams as generally known in the art.

The compounds can also be prepared in the form of suppositories (e.g., with
conventional suppository bases such as cocoa butter and other glycerides) or retention
enemas for rectal delivery.

In one embodiment, the active compounds are prepared with carriers that will
protect the compound against rapid elimination from the body, such as a controlled release
formulation, including implants and microencapsulated delivery systems. Biodegradable,
biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides,
polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation
of such formulations will be apparent to those skilled in the art. The materials can also be
obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal
suspensions (including liposomes targeted to infected cells with monoclonal antibodies to
viral antigens) can also be used as pharmaceutically acceptable carriers. These can be
prepared according to methods known to those skilled in the art, for example, as described
in U.S. Patent No. 4,522,811.

It is especially advantageous to formulate oral or parenteral compositions in dosage
unit form for ease of administration and uniformity of dosage. Dosage unit form as used
herein refers to physically discrete units suited as unitary dosages for the subject to be
treated, each unit containing a predetermined quantity of active compound calculated to
produce the desired therapeutic effect in association with the required pharmaceutical
carrier. The specifications for the dosage unit forms of the invention are dictated by and
directly dependent on the unique characteristics of the active compound and the particular
therapeutic effect to be achieved, and the limitations inherent in the art of compounding
such an active compound for the treatment of individuals.

The nucleic acid molecules of the invention can be inserted into vectors and used as
gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example,
intravenous injection, local administration (see, e.g., U.S. Patent No. 5,328,470) or by
stereotactic injection (see, e.g., Chen, et al., 1994. Proc. Natl. Acad. Sci. USA 91:
3054-3057). The pharmaceutical preparation of the gene therapy vector can include the
gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in
which the gene delivery vehicle is imbedded. Alternatively, where the complete gene
delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the
pharmaceutical preparation can include one or more cells that produce the gene delivery
system.

The pharmaceutical compositions can be included in a container, pack, or dispenser 
together with instructions for administration. 

Screening and Detection Methods

The isolated nucleic acid molecules of the invention can be used to express NOVX
protein (e.g., via a recombinant expression vector in a host cell in gene therapy
applications), to detect NOVX mRNA (e.g., in a biological sample) or a genetic lesion in a
NOVX gene, and to modulate NOVX activity, as described further, below. In addition, the
NOVX proteins can be used to screen drugs or compounds that modulate the NOVX
protein activity or expression as well as to treat disorders characterized by insufficient or
excessive production of NOVX protein or production of NOVX protein forms that have
decreased or aberrant activity compared to NOVX wild-type protein (e.g., diabetes
(regulates insulin release); obesity (binds and transports lipids); metabolic disturbances
associated with obesity, the metabolic syndrome X as well as anorexia and wasting
disorders associated with chronic diseases and various cancers, and infectious
disease (possesses anti-microbial activity) and the various dyslipidemias). In addition, the
anti-NOVX antibodies of the invention can be used to detect and isolate NOVX proteins
and modulate NOVX activity. In yet a further aspect, the invention can be used in methods
to influence appetite, absorption of nutrients and the disposition of metabolic substrates in
both a positive and negative fashion.

The invention further pertains to novel agents identified by the screening assays
described herein and uses thereof for treatments as described, supra.

Screening Assays 

The invention provides a method (also referred to herein as a "screening assay") for
identifying modulators, i.e., candidate or test compounds or agents (e.g., peptides,
peptidomimetics, small molecules or other drugs) that bind to NOVX proteins or have a
stimulatory or inhibitory effect on, e.g., NOVX protein expression or NOVX protein
activity. The invention also includes compounds identified in the screening assays
described herein.

In one embodiment, the invention provides assays for screening candidate or test
compounds which bind to or modulate the activity of the membrane-bound form of a
NOVX protein or polypeptide or biologically-active portion thereof. The test compounds
of the invention can be obtained using any of the numerous approaches in combinatorial
library methods known in the art, including: biological libraries; spatially addressable
parallel solid phase or solution phase libraries; synthetic library methods requiring
deconvolution; the "one-bead one-compound" library method; and synthetic library
methods using affinity chromatography selection. The biological library approach is
limited to peptide libraries, while the other four approaches are applicable to peptide,
non-peptide oligomer or small molecule libraries of compounds. See, e.g., Lam, 1997.
Anticancer Drug Design 12: 145.

A "small molecule" as used herein, is meant to refer to a composition that has a
molecular weight of less than about 5 kD and most preferably less than about 4 kD. Small
molecules can be, e.g., nucleic acids, peptides, polypeptides, peptidomimetics,
carbohydrates, lipids or other organic or inorganic molecules. Libraries of chemical and/or
biological mixtures, such as fungal, bacterial, or algal extracts, are known in the art and can
be screened with any of the assays of the invention.

Examples of methods for the synthesis of molecular libraries can be found in the
art, for example in: DeWitt, et al., 1993. Proc. Natl. Acad. Sci. USA 90: 6909; Erb, et al.,
1994. Proc. Natl. Acad. Sci. USA 91: 11422; Zuckermann, et al., 1994. J. Med. Chem. 37:
2678; Cho, et al., 1993. Science 261: 1303; Carrell, et al., 1994. Angew. Chem. Int. Ed.
Engl. 33: 2059; Carell, et al., 1994. Angew. Chem. Int. Ed. Engl. 33: 2061; and Gallop, et
al., 1994. J. Med. Chem. 37: 1233.

Libraries of compounds may be presented in solution (e.g., Houghten, 1992.
Biotechniques 13: 412-421), or on beads (Lam, 1991. Nature 354: 82-84), on chips (Fodor,
1993. Nature 364: 555-556), bacteria (Ladner, U.S. Patent No. 5,223,409), spores (Ladner,
U.S. Patent 5,233,409), plasmids (Cull, et al., 1992. Proc. Natl. Acad. Sci. USA 89:
1865-1869) or on phage (Scott and Smith, 1990. Science 249: 386-390; Devlin, 1990.
Science 249: 404-406; Cwirla, et al., 1990. Proc. Natl. Acad. Sci. USA 87: 6378-6382;
Felici, 1991. J. Mol. Biol. 222: 301-310; Ladner, U.S. Patent No. 5,233,409).

In one embodiment, an assay is a cell-based assay in which a cell which expresses a
membrane-bound form of NOVX protein, or a biologically-active portion thereof, on the
cell surface is contacted with a test compound and the ability of the test compound to bind
to a NOVX protein is determined. The cell, for example, can be of mammalian origin or a yeast
cell. Determining the ability of the test compound to bind to the NOVX protein can be
accomplished, for example, by coupling the test compound with a radioisotope or
enzymatic label such that binding of the test compound to the NOVX protein or
biologically-active portion thereof can be determined by detecting the labeled compound in
a complex. For example, test compounds can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either
directly or indirectly, and the radioisotope detected by direct counting of radioemission or
by scintillation counting. Alternatively, test compounds can be enzymatically-labeled with,
for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic
label detected by determination of conversion of an appropriate substrate to product. In
one embodiment, the assay comprises contacting a cell which expresses a membrane-bound
form of NOVX protein, or a biologically-active portion thereof, on the cell surface with a
known compound which binds NOVX to form an assay mixture, contacting the assay
mixture with a test compound, and determining the ability of the test compound to interact
with a NOVX protein, wherein determining the ability of the test compound to interact with
a NOVX protein comprises determining the ability of the test compound to preferentially
bind to NOVX protein or a biologically-active portion thereof as compared to the known
compound.

In another embodiment, an assay is a cell-based assay comprising contacting a cell
expressing a membrane-bound form of NOVX protein, or a biologically-active portion
thereof, on the cell surface with a test compound and determining the ability of the test
compound to modulate (e.g., stimulate or inhibit) the activity of the NOVX protein or
biologically-active portion thereof. Determining the ability of the test compound to
modulate the activity of NOVX or a biologically-active portion thereof can be
accomplished, for example, by determining the ability of the NOVX protein to bind to or
interact with a NOVX target molecule. As used herein, a "target molecule" is a molecule
with which a NOVX protein binds or interacts in nature, for example, a molecule on the
surface of a cell which expresses a NOVX interacting protein, a molecule on the surface of
a second cell, a molecule in the extracellular milieu, a molecule associated with the internal
surface of a cell membrane or a cytoplasmic molecule. A NOVX target molecule can be a
non-NOVX molecule or a NOVX protein or polypeptide of the invention. In one
embodiment, a NOVX target molecule is a component of a signal transduction pathway
that facilitates transduction of an extracellular signal (e.g., a signal generated by binding of
a compound to a membrane-bound NOVX molecule) through the cell membrane and into
the cell. The target, for example, can be a second intercellular protein that has catalytic
activity or a protein that facilitates the association of downstream signaling molecules with
NOVX.

Determining the ability of the NOVX protein to bind to or interact with a NOVX
target molecule can be accomplished by one of the methods described above for
determining direct binding. In one embodiment, determining the ability of the NOVX
protein to bind to or interact with a NOVX target molecule can be accomplished by
determining the activity of the target molecule. For example, the activity of the target
molecule can be determined by detecting induction of a cellular second messenger of the
target (i.e., intracellular Ca²⁺, diacylglycerol, IP3, etc.), detecting catalytic/enzymatic
activity of the target on an appropriate substrate, detecting the induction of a reporter gene
(comprising a NOVX-responsive regulatory element operatively linked to a nucleic acid
encoding a detectable marker, e.g., luciferase), or detecting a cellular response, for
example, cell survival, cellular differentiation, or cell proliferation.

In yet another embodiment, an assay of the invention is a cell-free assay comprising
contacting a NOVX protein or biologically-active portion thereof with a test compound and
determining the ability of the test compound to bind to the NOVX protein or
biologically-active portion thereof. Binding of the test compound to the NOVX protein can
be determined either directly or indirectly as described above. In one such embodiment,
the assay comprises contacting the NOVX protein or biologically-active portion thereof
with a known compound which binds NOVX to form an assay mixture, contacting the
assay mixture with a test compound, and determining the ability of the test compound to
interact with a NOVX protein, wherein determining the ability of the test compound to
interact with a NOVX protein comprises determining the ability of the test compound to
preferentially bind to NOVX or biologically-active portion thereof as compared to the
known compound.

In still another embodiment, an assay is a cell-free assay comprising contacting
NOVX protein or biologically-active portion thereof with a test compound and determining
the ability of the test compound to modulate (e.g., stimulate or inhibit) the activity of the
NOVX protein or biologically-active portion thereof. Determining the ability of the test
compound to modulate the activity of NOVX can be accomplished, for example, by
determining the ability of the NOVX protein to bind to a NOVX target molecule by one of
the methods described above for determining direct binding. In an alternative embodiment,
determining the ability of the test compound to modulate the activity of NOVX protein can
be accomplished by determining the ability of the NOVX protein to further modulate a
NOVX target molecule. For example, the catalytic/enzymatic activity of the target
molecule on an appropriate substrate can be determined as described, supra.

In yet another embodiment, the cell-free assay comprises contacting the NOVX
protein or biologically-active portion thereof with a known compound which binds NOVX
protein to form an assay mixture, contacting the assay mixture with a test compound, and
determining the ability of the test compound to interact with a NOVX protein, wherein
determining the ability of the test compound to interact with a NOVX protein comprises
determining the ability of the NOVX protein to preferentially bind to or modulate the
activity of a NOVX target molecule.

The cell-free assays of the invention are amenable to use of both the soluble form or
the membrane-bound form of NOVX protein. In the case of cell-free assays comprising the
membrane-bound form of NOVX protein, it may be desirable to utilize a solubilizing agent
such that the membrane-bound form of NOVX protein is maintained in solution. Examples
of such solubilizing agents include non-ionic detergents such as n-octylglucoside,
n-dodecylglucoside, n-dodecylmaltoside, octanoyl-N-methylglucamide,
decanoyl-N-methylglucamide, Triton® X-100, Triton® X-114, Thesit®,
Isotridecypoly(ethylene glycol ether)n, N-dodecyl-N,N-dimethyl-3-ammonio-1-propane
sulfonate, 3-(3-cholamidopropyl)dimethylamminiol-1-propane sulfonate (CHAPS), or
3-(3-cholamidopropyl)dimethylamminiol-2-hydroxy-1-propane sulfonate (CHAPSO).

In more than one embodiment of the above assay methods of the invention, it may
be desirable to immobilize either NOVX protein or its target molecule to facilitate
separation of complexed from uncomplexed forms of one or both of the proteins, as well as
to accommodate automation of the assay. Binding of a test compound to NOVX protein, or
interaction of NOVX protein with a target molecule in the presence and absence of a
candidate compound, can be accomplished in any vessel suitable for containing the
reactants. Examples of such vessels include microtiter plates, test tubes, and
micro-centrifuge tubes. In one embodiment, a fusion protein can be provided that adds a
domain that allows one or both of the proteins to be bound to a matrix. For example,
GST-NOVX fusion proteins or GST-target fusion proteins can be adsorbed onto
glutathione sepharose beads (Sigma Chemical, St. Louis, MO) or glutathione-derivatized
microtiter plates, which are then combined with the test compound or the test compound and
either the non-adsorbed target protein or NOVX protein, and the mixture is incubated under
conditions conducive to complex formation (e.g., at physiological conditions for salt and
pH). Following incubation, the beads or microtiter plate wells are washed to remove any
unbound components, the matrix is immobilized in the case of beads, and complex formation
is determined either directly or indirectly, for example, as described, supra. Alternatively,
the complexes can be dissociated from the matrix, and the level of NOVX protein binding or
activity determined using standard techniques.

Other techniques for immobilizing proteins on matrices can also be used in the
screening assays of the invention. For example, either the NOVX protein or its target
molecule can be immobilized utilizing conjugation of biotin and streptavidin. Biotinylated
NOVX protein or target molecules can be prepared from biotin-NHS
(N-hydroxy-succinimide) using techniques well-known within the art (e.g., biotinylation
kit, Pierce Chemicals, Rockford, Ill.), and immobilized in the wells of streptavidin-coated
96 well plates (Pierce Chemical). Alternatively, antibodies reactive with NOVX protein or
target molecules, but which do not interfere with binding of the NOVX protein to its target
molecule, can be derivatized to the wells of the plate, and unbound target or NOVX protein
trapped in the wells by antibody conjugation. Methods for detecting such complexes, in
addition to those described above for the GST-immobilized complexes, include
immunodetection of complexes using antibodies reactive with the NOVX protein or target
molecule, as well as enzyme-linked assays that rely on detecting an enzymatic activity
associated with the NOVX protein or target molecule.

In another embodiment, modulators of NOVX protein expression are identified in a
method wherein a cell is contacted with a candidate compound and the expression of
NOVX mRNA or protein in the cell is determined. The level of expression of NOVX
mRNA or protein in the presence of the candidate compound is compared to the level of
expression of NOVX mRNA or protein in the absence of the candidate compound. The
candidate compound can then be identified as a modulator of NOVX mRNA or protein
expression based upon this comparison. For example, when expression of NOVX mRNA
or protein is greater (i.e., statistically significantly greater) in the presence of the candidate
compound than in its absence, the candidate compound is identified as a stimulator of
NOVX mRNA or protein expression. Alternatively, when expression of NOVX mRNA or
protein is less (i.e., statistically significantly less) in the presence of the candidate compound
than in its absence, the candidate compound is identified as an inhibitor of NOVX mRNA
or protein expression. The level of NOVX mRNA or protein expression in the cells can be
determined by methods described herein for detecting NOVX mRNA or protein.
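
The comparison described in this embodiment can be illustrated with a short Python
sketch. It is offered only as an example under stated assumptions and is not part of the
disclosed method: the function names, the choice of an exact two-sided permutation test
on the difference of group means, and the 0.05 significance threshold are all hypothetical
choices made for illustration. Any suitable statistical test could be substituted.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation p-value for the difference in group means.

    Every way of relabeling the pooled measurements into a 'treated' group of
    the same size is enumerated, so this is only practical for small samples.
    """
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    n_rest = len(pooled) - n_treated
    observed = abs(mean(treated) - mean(untreated))
    total = sum(pooled)
    extreme = 0
    count = 0
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (total - group_sum) / n_rest)
        # Small tolerance so the observed labeling is always counted.
        if diff >= observed - 1e-12:
            extreme += 1
        count += 1
    return extreme / count


def classify_modulator(treated, untreated, alpha=0.05):
    """Label a candidate compound from expression levels with and without it.

    alpha is an assumed significance threshold, not a value from the text.
    """
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"
```

With four replicate measurements per group, the smallest two-sided p-value this test can
return is 2/70 (about 0.029), so a complete separation of the two groups is needed to
reach the assumed 0.05 threshold; more replicates give finer resolution.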

In yet another aspect of the invention, the NOVX proteins can be used as "bait
proteins" in a two-hybrid assay or three hybrid assay (see, e.g., U.S. Patent No. 5,283,317;
Zervos, et al., 1993. Cell 72: 223-232; Madura, et al., 1993. J. Biol. Chem. 268:
12046-12054; Bartel, et al., 1993. Biotechniques 14: 920-924; Iwabuchi, et al., 1993.
Oncogene 8: 1693-1696; and Brent WO 94/10300), to identify other proteins that bind to or
interact with NOVX ("NOVX-binding proteins" or "NOVX-bp") and modulate NOVX
activity. Such NOVX-binding proteins are also involved in the propagation of signals by
the NOVX proteins as, for example, upstream or downstream elements of the NOVX
pathway.

The two-hybrid system is based on the modular nature of most transcription factors,
which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes
two different DNA constructs. In one construct, the gene that codes for NOVX is fused to a
gene encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In
the other construct, a DNA sequence, from a library of DNA sequences, that encodes an
unidentified protein ("prey" or "sample") is fused to a gene that codes for the activation
domain of the known transcription factor. If the "bait" and the "prey" proteins are able to
interact, in vivo, forming a NOVX-dependent complex, the DNA-binding and activation
domains of the transcription factor are brought into close proximity. This proximity allows
transcription of a reporter gene (e.g., LacZ) that is operably linked to a transcriptional
regulatory site responsive to the transcription factor. Expression of the reporter gene can
be detected and cell colonies containing the functional transcription factor can be isolated
and used to obtain the cloned gene that encodes the protein which interacts with NOVX.

The invention further pertains to novel agents identified by the aforementioned 
screening assays and uses thereof for treatments as described herein. 

Detection Assays 

Portions or fragments of the cDNA sequences identified herein (and the
corresponding complete gene sequences) can be used in numerous ways as polynucleotide
reagents. By way of example, and not of limitation, these sequences can be used to: (i)
map their respective genes on a chromosome and, thus, locate gene regions associated with
genetic disease; (ii) identify an individual from a minute biological sample (tissue typing);
and (iii) aid in forensic identification of a biological sample. Some of these applications
are described in the subsections, below.

Chromosome Mapping 

Once the sequence (or a portion of the sequence) of a gene has been isolated, this
sequence can be used to map the location of the gene on a chromosome. This process is
called chromosome mapping. Accordingly, portions or fragments of the NOVX sequences
of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, or fragments or derivatives
thereof, can be used to map the location of the NOVX genes, respectively, on a
chromosome. The mapping of the NOVX sequences to chromosomes is an important first
step in correlating these sequences with genes associated with disease.

Briefly, NOVX genes can be mapped to chromosomes by preparing PCR primers
(preferably 15-25 bp in length) from the NOVX sequences. Computer analysis of the
NOVX sequences can be used to rapidly select primers that do not span more than one
exon in the genomic DNA, since a primer that spans two exons would complicate the
amplification process. These primers can
then be used for PCR screening of somatic cell hybrids containing individual human
chromosomes. Only those hybrids containing the human gene corresponding to the NOVX
sequences will yield an amplified fragment.
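
The computer-assisted primer selection mentioned above can be sketched as follows.
This is a minimal Python illustration under stated assumptions, not the analysis actually
used: the function names are hypothetical, the cDNA is assumed to be available as a
string, and exon junctions are assumed to be known as 0-based cDNA positions at which
a new exon begins. Only the exon constraint and the 15-25 bp length range are applied;
melting temperature, GC content, and specificity checks are deliberately omitted.

```python
def spans_junction(start, length, junctions):
    """Return True if the window [start, start + length) crosses an exon junction.

    A junction at position j lies between bases j - 1 and j, so a window
    crosses it only when it contains bases on both sides of j.
    """
    end = start + length
    return any(start < j < end for j in junctions)


def candidate_primers(cdna, junctions, min_len=15, max_len=25):
    """List (start, sequence) for every window that lies within a single exon."""
    primers = []
    for length in range(min_len, max_len + 1):
        for start in range(len(cdna) - length + 1):
            if not spans_junction(start, length, junctions):
                primers.append((start, cdna[start:start + length]))
    return primers


# Hypothetical 40-base cDNA with one exon junction at position 20.
example = candidate_primers("ACGT" * 10, junctions=[20], min_len=15, max_len=15)
```

In practice the surviving windows would then be filtered further by whatever primer
design criteria are in use.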

Somatic cell hybrids are prepared by fusing somatic cells from different mammals
(e.g., human and mouse cells). As hybrids of human and mouse cells grow and divide, they
gradually lose human chromosomes in random order, but retain the mouse chromosomes.
By using media in which mouse cells cannot grow, because they lack a particular enzyme,
but in which human cells can, the one human chromosome that contains the gene encoding
the needed enzyme will be retained. By using various media, panels of hybrid cell lines
can be established. Each cell line in a panel contains either a single human chromosome or
a small number of human chromosomes, and a full set of mouse chromosomes, allowing
easy mapping of individual genes to specific human chromosomes. See, e.g., D'Eustachio,
et al., 1983. Science 220: 919-924. Somatic cell hybrids containing only fragments of
human chromosomes can also be produced by using human chromosomes with
translocations and deletions.

PCR mapping of somatic cell hybrids is a rapid procedure for assigning a particular
sequence to a particular chromosome. Three or more sequences can be assigned per day
using a single thermal cycler. Using the NOVX sequences to design oligonucleotide
primers, sub-localization can be achieved with panels of fragments from specific
chromosomes.
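
The logic of assigning a sequence to a chromosome from a hybrid panel can be sketched
in Python. This is an illustrative example only: the function name, the panel layout (a
mapping from hybrid cell line to the set of human chromosomes it retains), and the
cell-line labels are assumptions made for the sketch. A chromosome is reported when its
presence across the panel matches the PCR amplification results exactly.

```python
def concordant_chromosomes(panel, amplified):
    """Return the chromosomes whose presence pattern matches the PCR results.

    panel:     dict mapping hybrid name -> set of human chromosomes retained
    amplified: dict mapping hybrid name -> True if a PCR product was obtained
    """
    all_chromosomes = set().union(*panel.values())
    matches = []
    for chrom in sorted(all_chromosomes):
        if all((chrom in retained) == amplified[hybrid]
               for hybrid, retained in panel.items()):
            matches.append(chrom)
    return matches


# Hypothetical four-line panel and PCR results.
panel = {
    "H1": {"1", "7"},
    "H2": {"7", "X"},
    "H3": {"1", "X"},
    "H4": {"11"},
}
amplified = {"H1": True, "H2": True, "H3": False, "H4": False}
assignment = concordant_chromosomes(panel, amplified)
```

Here only chromosome 7 is present in every amplifying hybrid and absent from every
non-amplifying one. A real panel would need enough lines that each chromosome has a
distinct presence pattern; otherwise more than one chromosome may be returned.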

Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase
chromosomal spread can further be used to provide a precise chromosomal location in one
step. Chromosome spreads can be made using cells whose division has been blocked in
metaphase by a chemical like colcemid that disrupts the mitotic spindle. The chromosomes
can be treated briefly with trypsin, and then stained with Giemsa. A pattern of light and
dark bands develops on each chromosome, so that the chromosomes can be identified
individually. The FISH technique can be used with a DNA sequence as short as 500 or 600
bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a
unique chromosomal location with sufficient signal intensity for simple detection.
Preferably 1,000 bases, and more preferably 2,000 bases, will suffice to get good results in
a reasonable amount of time. For a review of this technique, see Verma, et al., Human
Chromosomes: A Manual of Basic Techniques (Pergamon Press, New York 1988).

Reagents for chromosome mapping can be used individually to mark a single
chromosome or a single site on that chromosome, or panels of reagents can be used for
marking multiple sites and/or multiple chromosomes. Reagents corresponding to
noncoding regions of the genes actually are preferred for mapping purposes. Coding
sequences are more likely to be conserved within gene families, thus increasing the chance
of cross hybridizations during chromosomal mapping.

Once a sequence has been mapped to a precise chromosomal location, the physical
position of the sequence on the chromosome can be correlated with genetic map data. Such
data are found, e.g., in McKusick, Mendelian Inheritance in Man (available on-line
through Johns Hopkins University Welch Medical Library). The relationship between
genes and disease, mapped to the same chromosomal region, can then be identified through
linkage analysis (co-inheritance of physically adjacent genes), described in, e.g., Egeland,
et al., 1987. Nature 325: 783-787.

Moreover, differences in the DNA sequences between individuals affected and
unaffected with a disease associated with the NOVX gene can be determined. If a
mutation is observed in some or all of the affected individuals but not in any unaffected
individuals, then the mutation is likely to be the causative agent of the particular disease.
Comparison of affected and unaffected individuals generally involves first looking for
structural alterations in the chromosomes, such as deletions or translocations that are
visible from chromosome spreads or detectable using PCR based on that DNA sequence.
Ultimately, complete sequencing of genes from several individuals can be performed to 
confirm the presence of a mutation and to distinguish mutations from polymorphisms. 
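
The comparison rule stated above (a mutation seen in some or all affected individuals but in no unaffected individual) can be sketched as a set operation. This is only an illustration under stated assumptions: the mutation labels and the function name candidate_causative_mutations are hypothetical, and the sketch does not attempt the later step of distinguishing mutations from polymorphisms.

```python
def candidate_causative_mutations(affected, unaffected):
    """Return mutations seen in at least one affected individual and in no unaffected one.

    Both arguments are lists of sets; each set holds the mutation labels
    observed in one individual (the labels are hypothetical placeholders).
    """
    seen_in_affected = set().union(*affected) if affected else set()
    seen_in_unaffected = set().union(*unaffected) if unaffected else set()
    return seen_in_affected - seen_in_unaffected


# Hypothetical observations: "m1" occurs only in affected individuals.
affected = [{"m1", "m2"}, {"m1"}, {"m1", "m3"}]
unaffected = [{"m2"}, {"m3"}, set()]
print(sorted(candidate_causative_mutations(affected, unaffected)))
```

Labels that also occur in unaffected individuals ("m2" and "m3" here) are excluded, which mirrors the reasoning in the text rather than any statistical test.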

Tissue Typing 

The NOVX sequences of the invention can also be used to identify individuals from 
minute biological samples. In this technique, an individual's genomic DNA is digested
with one or more restriction enzymes, and probed on a Southern blot to yield unique bands
for identification. The sequences of the invention are useful as additional DNA markers for
RFLP ("restriction fragment length polymorphisms," described in U.S. Patent No.
5,272,057).

Furthermore, the sequences of the invention can be used to provide an alternative 
technique that determines the actual base-by-base DNA sequence of selected portions of an 
individual's genome. Thus, the NOVX sequences described herein can be used to prepare
two PCR primers from the 5'- and 3'-termini of the sequences. These primers can then be
used to amplify an individual's DNA and subsequently sequence it. 

Panels of corresponding DNA sequences from individuals, prepared in this manner, 
can provide unique individual identifications, as each individual will have a unique set of
such DNA sequences due to allelic differences. The sequences of the invention can be used
to obtain such identification sequences from individuals and from tissue. The NOVX 
sequences of the invention uniquely represent portions of the human genome. Allelic 
variation occurs to some degree in the coding regions of these sequences, and to a greater 
degree in the noncoding regions. It is estimated that allelic variation between individual
humans occurs with a frequency of about once per each 500 bases. Much of the allelic
variation is due to single nucleotide polymorphisms (SNPs), which include restriction 
fragment length polymorphisms (RFLPs). 

Each of the sequences described herein can, to some degree, be used as a standard 
against which DNA from an individual can be compared for identification purposes.
Because greater numbers of polymorphisms occur in the noncoding regions, fewer
sequences are necessary to differentiate individuals. The noncoding sequences can
comfortably provide positive individual identification with a panel of perhaps 10 to 1,000
primers that each yield a noncoding amplified sequence of 100 bases. If coding sequences,
such as those of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, are used, a
more appropriate number of primers for positive individual identification would be
500-2,000.
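
As a rough arithmetic illustration of the panel sizes above, the following sketch estimates how many variable positions a primer panel would be expected to cover. It assumes, purely for illustration, that the quoted average of one allelic difference per 500 bases applies uniformly to every amplified 100-base sequence; the text itself notes that noncoding regions vary more and coding regions less, so the real figures would differ. The function name expected_variant_sites is hypothetical.

```python
def expected_variant_sites(n_primers, amplicon_length=100, bases_per_variant=500):
    """Expected number of variable positions covered by a primer panel.

    Assumes a uniform rate of one allelic difference per `bases_per_variant`
    bases, which is a simplification of the estimate quoted in the text.
    """
    return n_primers * amplicon_length / bases_per_variant


# Panel sizes mentioned in the text (10 to 1,000 noncoding, 500 to 2,000 coding).
for n in (10, 500, 1000, 2000):
    print(n, expected_variant_sites(n))
```

Under this uniform-rate assumption, a 10-primer panel covers about 2 variable positions on average and a 1,000-primer panel about 200.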

Predictive Medicine 

The invention also pertains to the field of predictive medicine in which diagnostic 
assays, prognostic assays, pharmacogenomics, and monitoring clinical trials are used for 
prognostic (predictive) purposes to thereby treat an individual prophylactically.

Accordingly, one aspect of the invention relates to diagnostic assays for determining 
NOVX protein and/or nucleic acid expression as well as NOVX activity, in the context of a 
biological sample (e.g., blood, serum, cells, tissue) to thereby determine whether an 
individual is afflicted with a disease or disorder, or is at risk of developing a disorder,
associated with aberrant NOVX expression or activity. The disorders include metabolic
disorders, diabetes, obesity, infectious disease, anorexia, cancer-associated cachexia,
cancer, neurodegenerative disorders, Alzheimer's Disease, Parkinson's Disorder, immune
disorders, and hematopoietic disorders, and the various dyslipidemias, metabolic
disturbances associated with obesity, the metabolic syndrome X and wasting disorders
associated with chronic diseases and various cancers. The invention also provides for 
prognostic (or predictive) assays for determining whether an individual is at risk of 
developing a disorder associated with NOVX protein, nucleic acid expression or activity. 

For example, mutations in a NOVX gene can be assayed in a biological sample. Such
assays can be used for prognostic or predictive purpose to thereby prophylactically treat an
individual prior to the onset of a disorder characterized by or associated with NOVX 
protein, nucleic acid expression, or biological activity. 

Another aspect of the invention provides methods for determining NOVX protein,
nucleic acid expression or activity in an individual to thereby select appropriate therapeutic
or prophylactic agents for that individual (referred to herein as "pharmacogenomics").
Pharmacogenomics allows for the selection of agents (e.g., drugs) for therapeutic or
prophylactic treatment of an individual based on the genotype of the individual (e.g., the
genotype of the individual examined to determine the ability of the individual to respond to
a particular agent).

Yet another aspect of the invention pertains to monitoring the influence of agents 
(e.g., drugs, compounds) on the expression or activity of NOVX in clinical trials. 

These and other agents are described in further detail in the following sections. 

Diagnostic Assays 

An exemplary method for detecting the presence or absence of NOVX in a
biological sample involves obtaining a biological sample from a test subject and contacting
the biological sample with a compound or an agent capable of detecting NOVX protein or
nucleic acid (e.g., mRNA, genomic DNA) that encodes NOVX protein such that the
presence of NOVX is detected in the biological sample. An agent for detecting NOVX
mRNA or genomic DNA is a labeled nucleic acid probe capable of hybridizing to NOVX
mRNA or genomic DNA. The nucleic acid probe can be, for example, a full-length NOVX
nucleic acid, such as the nucleic acid of SEQ ID NO:2n-1, wherein n is an integer between
1 and 124, or a portion thereof, such as an oligonucleotide of at least 15, 30, 50, 100, 250 or
500 nucleotides in length and sufficient to specifically hybridize under stringent conditions
to NOVX mRNA or genomic DNA. Other suitable probes for use in the diagnostic assays 
of the invention are described herein. 

An agent for detecting NOVX protein is an antibody capable of binding to NOVX 
protein, preferably an antibody with a detectable label. Antibodies can be polyclonal, or
more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or
F(ab')2) can be used. The term "labeled", with regard to the probe or antibody, is intended
to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking)
a detectable substance to the probe or antibody, as well as indirect labeling of the probe or
antibody by reactivity with another reagent that is directly labeled. Examples of indirect
labeling include detection of a primary antibody using a fluorescently-labeled secondary
antibody and end-labeling of a DNA probe with biotin such that it can be detected with
fluorescently-labeled streptavidin. The term "biological sample" is intended to include
tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids
present within a subject. That is, the detection method of the invention can be used to
detect NOVX mRNA, protein, or genomic DNA in a biological sample in vitro as well as
in vivo. For example, in vitro techniques for detection of NOVX mRNA include Northern
hybridizations and in situ hybridizations. In vitro techniques for detection of NOVX
protein include enzyme linked immunosorbent assays (ELISAs), Western blots,
immunoprecipitations, and immunofluorescence. In vitro techniques for detection of
NOVX genomic DNA include Southern hybridizations. Furthermore, in vivo techniques 
for detection of NOVX protein include introducing into a subject a labeled anti-NOVX 
antibody. For example, the antibody can be labeled with a radioactive marker whose 
presence and location in a subject can be detected by standard imaging techniques. 

In one embodiment, the biological sample contains protein molecules from the test
subject. Alternatively, the biological sample can contain mRNA molecules from the test
subject or genomic DNA molecules from the test subject. A preferred biological sample is 
a peripheral blood leukocyte sample isolated by conventional means from a subject. 

In another embodiment, the methods further involve obtaining a control biological
sample from a control subject, contacting the control sample with a compound or agent
capable of detecting NOVX protein, mRNA, or genomic DNA, such that the presence of
NOVX protein, mRNA or genomic DNA is detected in the biological sample, and
comparing the presence of NOVX protein, mRNA or genomic DNA in the control sample
with the presence of NOVX protein, mRNA or genomic DNA in the test sample. 

The invention also encompasses kits for detecting the presence of NOVX in a 
biological sample. For example, the kit can comprise: a labeled compound or agent 
capable of detecting NOVX protein or mRNA in a biological sample; means for 
determining the amount of NOVX in the sample; and means for comparing the amount of 
NOVX in the sample with a standard. The compound or agent can be packaged in a 
suitable container. The kit can further comprise instructions for using the kit to detect 
NOVX protein or nucleic acid. 

Prognostic Assays 

The diagnostic methods described herein can furthermore be utilized to identify 
subjects having or at risk of developing a disease or disorder associated with aberrant 
NOVX expression or activity. For example, the assays described herein, such as the 
preceding diagnostic assays or the following assays, can be utilized to identify a subject 
having or at risk of developing a disorder associated with NOVX protein, nucleic acid 
expression or activity. Alternatively, the prognostic assays can be utilized to identify a 
subject having or at risk for developing a disease or disorder. Thus, the invention provides 
a method for identifying a disease or disorder associated with aberrant NOVX expression 
or activity in which a test sample is obtained from a subject and NOVX protein or nucleic 
acid (e.g., mRNA, genomic DNA) is detected, wherein the presence of NOVX protein or 
nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder
associated with aberrant NOVX expression or activity. As used herein, a "test sample" 
refers to a biological sample obtained from a subject of interest. For example, a test sample 
can be a biological fluid (e.g., serum), cell sample, or tissue. 

Furthermore, the prognostic assays described herein can be used to determine 
whether a subject can be administered an agent (e.g., an agonist, antagonist, 
peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) to 
treat a disease or disorder associated with aberrant NOVX expression or activity. For 
example, such methods can be used to determine whether a subject can be effectively 
treated with an agent for a disorder. Thus, the invention provides methods for determining 
whether a subject can be effectively treated with an agent for a disorder associated with 
aberrant NOVX expression or activity in which a test sample is obtained and NOVX 
protein or nucleic acid is detected (e.g., wherein the presence of NOVX protein or nucleic
acid is diagnostic for a subject that can be administered the agent to treat a disorder
associated with aberrant NOVX expression or activity). 

The methods of the invention can also be used to detect genetic lesions in a NOVX 
gene, thereby determining if a subject with the lesioned gene is at risk for a disorder 
characterized by aberrant cell proliferation and/or differentiation. In various embodiments,
the methods include detecting, in a sample of cells from the subject, the presence or
absence of a genetic lesion characterized by at least one of an alteration affecting the
integrity of a gene encoding a NOVX-protein, or the misexpression of the NOVX gene.
For example, such genetic lesions can be detected by ascertaining the existence of at least
one of: (i) a deletion of one or more nucleotides from a NOVX gene; (ii) an addition of one
or more nucleotides to a NOVX gene; (iii) a substitution of one or more nucleotides of a
NOVX gene; (iv) a chromosomal rearrangement of a NOVX gene; (v) an alteration in the
level of a messenger RNA transcript of a NOVX gene; (vi) aberrant modification of a
NOVX gene, such as of the methylation pattern of the genomic DNA; (vii) the presence of
a non-wild-type splicing pattern of a messenger RNA transcript of a NOVX gene; (viii) a
non-wild-type level of a NOVX protein; (ix) allelic loss of a NOVX gene; and (x)
inappropriate post-translational modification of a NOVX protein. As described herein,
there are a large number of assay techniques known in the art which can be used for
detecting lesions in a NOVX gene. A preferred biological sample is a peripheral blood
leukocyte sample isolated by conventional means from a subject. However, any biological
sample containing nucleated cells may be used, including, for example, buccal mucosal 
cells. 

In certain embodiments, detection of the lesion involves the use of a probe/primer in 
a polymerase chain reaction (PCR) (see, e.g., U.S. Patent Nos. 4,683,195 and 4,683,202),
such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR)
(see, e.g., Landegran, et al., 1988. Science 241: 1077-1080; and Nakazawa, et al., 1994.
Proc. Natl. Acad. Sci. USA 91: 360-364), the latter of which can be particularly useful for
detecting point mutations in the NOVX-gene (see, Abravaya, et al., 1995. Nucl. Acids Res.
23: 675-682). This method can include the steps of collecting a sample of cells from a
patient, isolating nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample,
contacting the nucleic acid sample with one or more primers that specifically hybridize to a
NOVX gene under conditions such that hybridization and amplification of the NOVX gene
(if present) occurs, and detecting the presence or absence of an amplification product, or
detecting the size of the amplification product and comparing the length to a control
sample. It is anticipated that PCR and/or LCR may be desirable to use as a preliminary 
amplification step in conjunction with any of the techniques used for detecting mutations 
described herein. 

Alternative amplification methods include: self sustained sequence replication (see,
Guatelli, et al., 1990. Proc. Natl. Acad. Sci. USA 87: 1874-1878), transcriptional
amplification system (see, Kwoh, et al., 1989. Proc. Natl. Acad. Sci. USA 86: 1173-1177);
Qβ Replicase (see, Lizardi, et al., 1988. BioTechnology 6: 1197), or any other nucleic acid
amplification method, followed by the detection of the amplified molecules using 
techniques well known to those of skill in the art. These detection schemes are especially 
useful for the detection of nucleic acid molecules if such molecules are present in very low 
numbers. 

In an alternative embodiment, mutations in a NOVX gene from a sample cell can be 
identified by alterations in restriction enzyme cleavage patterns. For example, sample and 
control DNA is isolated, amplified (optionally), digested with one or more restriction 
endonucleases, and fragment length sizes are determined by gel electrophoresis and 
compared. Differences in fragment length sizes between sample and control DNA
indicate mutations in the sample DNA. Moreover, sequence specific ribozymes
(see, e.g., U.S. Patent No. 5,493,531) can be used to score for the presence of specific
mutations by development or loss of a ribozyme cleavage site. 
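
The restriction-pattern comparison described above can be illustrated with a small in silico digest. This is a simplified sketch, not a laboratory protocol: it models a single strand of a linear molecule, uses the EcoRI recognition site GAATTC (cut after the first G) only as a familiar example, and the sequences and the function name digest are hypothetical.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting `sequence` at every occurrence of `site`.

    `cut_offset` is the cut position inside the site (1 models EcoRI, G^AATTC).
    Only one strand of a linear molecule is modelled.
    """
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical control and sample; the sample has lost the second site (A to G).
control = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TT"
sample = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TT"

print(digest(control))  # fragment lengths for the control
print(digest(sample))   # fragment lengths for the sample
```

A difference between the two fragment-length lists corresponds to the difference in banding pattern that gel electrophoresis would reveal.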

In other embodiments, genetic mutations in NOVX can be identified by hybridizing
a sample and control nucleic acids, e.g., DNA or RNA, to high-density arrays containing
hundreds or thousands of oligonucleotide probes. See, e.g., Cronin, et al., 1996. Human
Mutation 7: 244-255; Kozal, et al., 1996. Nat. Med. 2: 753-759. For example, genetic
mutations in NOVX can be identified in two dimensional arrays containing light-generated
DNA probes as described in Cronin, et al., supra. Briefly, a first hybridization array of
probes can be used to scan through long stretches of DNA in a sample and control to 
identify base changes between the sequences by making linear arrays of sequential 
overlapping probes. This step allows the identification of point mutations. This is 
followed by a second hybridization array that allows the characterization of specific 
mutations by using smaller, specialized probe arrays complementary to all variants or 
mutations detected. Each mutation array is composed of parallel probe sets, one 
complementary to the wild-type gene and the other complementary to the mutant gene. 
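
The first scanning step, a linear array of sequential overlapping probes, can be sketched as follows. The sketch is an assumption-laden simplification: hybridization is modelled as an exact match at the same offset, the sample is assumed to be the same length as the reference, strand complementarity is ignored, and the sequences and function names are hypothetical.

```python
def tile_probes(reference, probe_length=8, step=1):
    """Yield (start, probe) pairs covering `reference` with overlapping probes."""
    for start in range(0, len(reference) - probe_length + 1, step):
        yield start, reference[start:start + probe_length]


def mismatching_probes(reference, sample, probe_length=8):
    """Return start positions of reference probes that do not match `sample` exactly."""
    return [start for start, probe in tile_probes(reference, probe_length)
            if sample[start:start + probe_length] != probe]


# Hypothetical sequences differing by one base at 0-based position 10 (G to T).
reference = "ACGTTGCAAGGCTTAACCGT"
sample = "ACGTTGCAAGTCTTAACCGT"

print(mismatching_probes(reference, sample))
```

Every probe that spans the changed base fails to match, so the run of failing start positions brackets the point mutation; a second, variant-specific probe set would then characterize it.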

In yet another embodiment, any of a variety of sequencing reactions known in the
art can be used to directly sequence the NOVX gene and detect mutations by comparing the
sequence of the sample NOVX with the corresponding wild-type (control) sequence.
Examples of sequencing reactions include those based on techniques developed by Maxam
and Gilbert, 1977. Proc. Natl. Acad. Sci. USA 74: 560 or Sanger, 1977. Proc. Natl. Acad.
Sci. USA 74: 5463. It is also contemplated that any of a variety of automated sequencing
procedures can be utilized when performing the diagnostic assays (see, e.g., Naeve, et al.,
1995. Biotechniques 19: 448), including sequencing by mass spectrometry (see, e.g., PCT
International Publication No. WO 94/16101; Cohen, et al., 1996. Adv. Chromatography 36:
127-162; and Griffin, et al., 1993. Appl. Biochem. Biotechnol. 38: 147-159).
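
Once a sample sequence has been obtained, the comparison with the wild-type (control) sequence can be sketched as a position-by-position check. This minimal illustration handles substitutions only and assumes the two sequences are already aligned and of equal length; insertions and deletions would need an alignment step that is not shown. The sequences and the function name find_substitutions are hypothetical.

```python
def find_substitutions(wild_type, sample):
    """List (1-based position, wild-type base, sample base) for each substituted base.

    Assumes both sequences are already aligned and of equal length.
    """
    if len(wild_type) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [(index + 1, wt_base, sample_base)
            for index, (wt_base, sample_base) in enumerate(zip(wild_type, sample))
            if wt_base != sample_base]


# Hypothetical sequences with a single C to T substitution.
print(find_substitutions("ATGGCCAAG", "ATGGCTAAG"))
```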

Other methods for detecting mutations in the NOVX gene include methods in which 
protection from cleavage agents is used to detect mismatched bases in RNA/RNA or 
RNA/DNA heteroduplexes. See, e.g., Myers, et al., 1985. Science 230: 1242. In general,
the art technique of "mismatch cleavage" starts by providing heteroduplexes formed by
hybridizing (labeled) RNA or DNA containing the wild-type NOVX sequence with
potentially mutant RNA or DNA obtained from a tissue sample. The double-stranded
duplexes are treated with an agent that cleaves single-stranded regions of the duplex, such
as those that will exist due to basepair mismatches between the control and sample strands.
For instance, RNA/DNA duplexes can be treated with RNase and DNA/DNA hybrids
treated with S1 nuclease to enzymatically digest the mismatched regions. In other
embodiments, either DNA/DNA or RNA/DNA duplexes can be treated with
hydroxylamine or osmium tetroxide and with piperidine in order to digest mismatched
regions. After digestion of the mismatched regions, the resulting material is then separated
by size on denaturing polyacrylamide gels to determine the site of mutation. See, e.g.,
Cotton, et al., 1988. Proc. Natl. Acad. Sci. USA 85: 4397; Saleeba, et al., 1992. Methods
Enzymol. 217: 286-295. In an embodiment, the control DNA or RNA can be labeled for
detection. 

In still another embodiment, the mismatch cleavage reaction employs one or more 
proteins that recognize mismatched base pairs in double-stranded DNA (so called "DNA 
mismatch repair" enzymes) in defined systems for detecting and mapping point mutations 
in NOVX cDNAs obtained from samples of cells. For example, the mutY enzyme of E. 
coli cleaves A at G/A mismatches and the thymidine DNA glycosylase from HeLa cells 
cleaves T at G/T mismatches. See, e.g., Hsu, et al., 1994. Carcinogenesis 15: 1657-1662. 
According to an exemplary embodiment, a probe based on a NOVX sequence, e.g., a
wild-type NOVX sequence, is hybridized to a cDNA or other DNA product from a test
cell(s). The duplex is treated with a DNA mismatch repair enzyme, and the cleavage
products, if any, can be detected from electrophoresis protocols or the like. See, e.g., U.S.
Patent No. 5,459,039.

In other embodiments, alterations in electrophoretic mobility will be used to 
identify mutations in NOVX genes. For example, single strand conformation 
polymorphism (SSCP) may be used to detect differences in electrophoretic mobility 
between mutant and wild type nucleic acids. See, e.g., Orita, et al., 1989. Proc. Natl. Acad.
Sci. USA 86: 2766; Cotton, 1993. Mutat. Res. 285: 125-144; Hayashi, 1992. Genet. Anal.
Tech. Appl. 9: 73-79. Single-stranded DNA fragments of sample and control NOVX
nucleic acids will be denatured and allowed to renature. The secondary structure of
single-stranded nucleic acids varies according to sequence; the resulting alteration in
electrophoretic mobility enables the detection of even a single base change. The DNA
fragments may be labeled or detected with labeled probes. The sensitivity of the assay may
be enhanced by using RNA (rather than DNA), in which the secondary structure is more
sensitive to a change in sequence. In one embodiment, the subject method utilizes
heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of
changes in electrophoretic mobility. See, e.g., Keen, et al., 1991. Trends Genet. 7: 5.

In yet another embodiment, the movement of mutant or wild-type fragments in
polyacrylamide gels containing a gradient of denaturant is assayed using denaturing
gradient gel electrophoresis (DGGE). See, e.g., Myers, et al., 1985. Nature 313: 495.
When DGGE is used as the method of analysis, DNA will be modified to ensure that it does
not completely denature, for example by adding a GC clamp of approximately 40 bp of
high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is
used in place of a denaturing gradient to identify differences in the mobility of control and
sample DNA. See, e.g., Rosenbaum and Reissner, 1987. Biophys. Chem. 265: 12753.

Examples of other techniques for detecting point mutations include, but are not 
limited to, selective oligonucleotide hybridization, selective amplification, or selective
primer extension. For example, oligonucleotide primers may be prepared in which the
known mutation is placed centrally and then hybridized to target DNA under conditions
that permit hybridization only if a perfect match is found. See, e.g., Saiki, et al., 1986.
Nature 324: 163; Saiki, et al., 1989. Proc. Natl. Acad. Sci. USA 86: 6230. Such allele
specific oligonucleotides are hybridized to PCR amplified target DNA or a number of
different mutations when the oligonucleotides are attached to the hybridizing membrane
and hybridized with labeled target DNA.
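
The idea of an allele specific oligonucleotide with the known mutation placed centrally can be sketched as below. This is a toy model under stated assumptions: stringent hybridization is reduced to an exact substring match on one strand, complementarity and melting behavior are ignored, and the sequences, the 7-base flank, and the function names are hypothetical.

```python
def allele_specific_probe(sequence, position, mutant_base, flank=7):
    """Build a probe of length 2*flank + 1 with the known mutation placed centrally.

    `position` is the 0-based index of the mutated base in the wild-type `sequence`.
    """
    if position - flank < 0 or position + flank >= len(sequence):
        raise ValueError("not enough flanking sequence around the mutation")
    left = sequence[position - flank:position]
    right = sequence[position + 1:position + flank + 1]
    return left + mutant_base + right


def hybridizes(probe, target):
    """Model perfect-match-only hybridization as an exact substring match."""
    return probe in target


# Hypothetical wild-type sequence and a T to G change at 0-based position 11.
wild_type = "GATTACAGGCTTAACCGTAGGCTA"
mutant = wild_type[:11] + "G" + wild_type[12:]
probe = allele_specific_probe(wild_type, 11, "G")

print(probe, hybridizes(probe, mutant), hybridizes(probe, wild_type))
```

Placing the variant base in the middle of the probe means a single mismatch against the wild-type target disrupts the match, which is the property the text relies on.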

Alternatively, allele specific amplification technology that depends on selective 
PCR amplification may be used in conjunction with the instant invention. Oligonucleotides
used as primers for specific amplification may carry the mutation of interest in the center of
the molecule (so that amplification depends on differential hybridization; see, e.g., Gibbs,
et al., 1989. Nucl. Acids Res. 17: 2437-2448) or at the extreme 3'-terminus of one primer
where, under appropriate conditions, mismatch can prevent or reduce polymerase
extension (see, e.g., Prossner, 1993. Tibtech. 11: 238). In addition it may be desirable to
introduce a novel restriction site in the region of the mutation to create cleavage-based
detection. See, e.g., Gasparini, et al., 1992. Mol. Cell Probes 6: 1. It is anticipated that in
certain embodiments amplification may also be performed using Taq ligase for
amplification. See, e.g., Barany, 1991. Proc. Natl. Acad. Sci. USA 88: 189. In such cases,
ligation will occur only if there is a perfect match at the 3'-terminus of the 5' sequence,
making it possible to detect the presence of a known mutation at a specific site by looking
for the presence or absence of amplification.

The methods described herein may be performed, for example, by utilizing 
pre-packaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent
described herein, which may be conveniently used, e.g., in clinical settings to diagnose
patients exhibiting symptoms or family history of a disease or illness involving a NOVX 
gene. 

Furthermore, any cell type or tissue, preferably peripheral blood leukocytes, in 
which NOVX is expressed may be utilized in the prognostic assays described herein. 
However, any biological sample containing nucleated cells may be used, including, for
example, buccal mucosal cells. 

Pharmacogenomics 

Agents, or modulators that have a stimulatory or inhibitory effect on NOVX activity 
(e.g., NOVX gene expression), as identified by a screening assay described herein can be 
administered to individuals to treat (prophylactically or therapeutically) disorders. The
disorders include but are not limited to, e.g., those diseases, disorders and conditions listed 
above, and more particularly include those diseases, disorders, or conditions associated 
with homologs of a NOVX protein, such as those summarized in Table A. 
In conjunction with such treatment, the pharmacogenomics (i.e., the study of the
relationship between an individual's genotype and that individual's response to a foreign 
compound or drug) of the individual may be considered. Differences in metabolism of 
therapeutics can lead to severe toxicity or therapeutic failure by altering the relation 
between dose and blood concentration of the pharmacologically active drug. Thus, the 
pharmacogenomics of the individual permits the selection of effective agents (e.g., drugs) 
for prophylactic or therapeutic treatments based on a consideration of the individual's 
genotype. Such pharmacogenomics can further be used to determine appropriate dosages
and therapeutic regimens. Accordingly, the activity of NOVX protein, expression of 
NOVX nucleic acid, or mutation content of NOVX genes in an individual can be 
determined to thereby select appropriate agent(s) for therapeutic or prophylactic treatment 
of the individual. 

Pharmacogenomics deals with clinically significant hereditary variations in the 
response to drugs due to altered drug disposition and abnormal action in affected persons. 
See, e.g., Eichelbaum, 1996. Clin. Exp. Pharmacol. Physiol., 23: 983-985; Linder, 1997.
Clin. Chem., 43: 254-266. In general, two types of pharmacogenetic conditions can be
differentiated: genetic conditions transmitted as a single factor altering the way drugs act
on the body (altered drug action), and genetic conditions transmitted as single factors altering
the way the body acts on drugs (altered drug metabolism). These pharmacogenetic
conditions can occur either as rare defects or as polymorphisms. For example, 
glucose-6-phosphate dehydrogenase (G6PD) deficiency is a common inherited 
enzymopathy in which the main clinical complication is hemolysis after ingestion of 
oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption of 
fava beans. 

As an illustrative embodiment, the activity of drug metabolizing enzymes is a major 
determinant of both the intensity and duration of drug action. The discovery of genetic 
polymorphisms of drug metabolizing enzymes (e.g., N-acetyltransferase 2 (NAT 2) and 
cytochrome P450 enzymes CYP2D6 and CYP2C19) has
provided an explanation as to why some patients do not obtain the expected drug effects or 
show exaggerated drug response and serious toxicity after taking the standard and safe dose 
of a drug. These polymorphisms are expressed in two phenotypes in the population, the 
extensive metabolizer (EM) and poor metabolizer (PM). The prevalence of PM is different 
among different populations. For example, the gene coding for CYP2D6 is highly 
polymorphic and several mutations have been identified in PM, which all lead to the
absence of functional CYP2D6. Poor metabolizers of CYP2D6 and CYP2C19 quite
frequently experience exaggerated drug response and side effects when they receive
standard doses. If a metabolite is the active therapeutic moiety, PM show no therapeutic
response, as demonstrated for the analgesic effect of codeine mediated by its
CYP2D6-formed metabolite morphine. At the other extreme are the so called ultra-rapid
metabolizers who do not respond to standard doses. Recently, the molecular basis of 
ultra-rapid metabolism has been identified to be due to CYP2D6 gene amplification. 

Thus, the activity of NOVX protein, expression of NOVX nucleic acid, or mutation
content of NOVX genes in an individual can be determined to thereby select appropriate
agent(s) for therapeutic or prophylactic treatment of the individual. In addition,
pharmacogenetic studies can be used to apply genotyping of polymorphic alleles encoding
drug-metabolizing enzymes to the identification of an individual's drug responsiveness
phenotype. This knowledge, when applied to dosing or drug selection, can avoid adverse
reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency
when treating a subject with a NOVX modulator, such as a modulator identified by one of 
the exemplary screening assays described herein. 

Monitoring of Effects During Clinical Trials 

Monitoring the influence of agents (e.g., drugs, compounds) on the expression or
activity of NOVX (e.g., the ability to modulate aberrant cell proliferation and/or
differentiation) can be applied not only in basic drug screening, but also in clinical trials.
For example, the effectiveness of an agent determined by a screening assay as described
herein to increase NOVX gene expression, protein levels, or upregulate NOVX activity,
can be monitored in clinical trials of subjects exhibiting decreased NOVX gene expression,
protein levels, or downregulated NOVX activity. Alternatively, the effectiveness of an
agent determined by a screening assay to decrease NOVX gene expression, protein levels,
or downregulate NOVX activity, can be monitored in clinical trials of subjects exhibiting
increased NOVX gene expression, protein levels, or upregulated NOVX activity. In such
clinical trials, the expression or activity of NOVX and, preferably, other genes that have
been implicated in, for example, a cellular proliferation or immune disorder can be used as
a "read out" or markers of the immune responsiveness of a particular cell. 

By way of example, and not of limitation, genes, including NOVX, that are 
modulated in cells by treatment with an agent (e.g., compound, drug or small molecule) 
that modulates NOVX activity (e.g., identified in a screening assay as described herein) can
be identified. Thus, to study the effect of agents on cellular proliferation disorders, for
example, in a clinical trial, cells can be isolated and RNA prepared and analyzed for the
levels of expression of NOVX and other genes implicated in the disorder. The levels of
gene expression (i.e., a gene expression pattern) can be quantified by Northern blot analysis
or RT-PCR, as described herein, or alternatively by measuring the amount of protein
produced, by one of the methods as described herein, or by measuring the levels of activity
of NOVX or other genes. In this manner, the gene expression pattern can serve as a
marker, indicative of the physiological response of the cells to the agent. Accordingly, this
response state may be determined before, and at various points during, treatment of the
individual with the agent. 

In one embodiment, the invention provides a method for monitoring the
effectiveness of treatment of a subject with an agent (e.g., an agonist, antagonist, protein,
peptide, peptidomimetic, nucleic acid, small molecule, or other drug candidate identified by
the screening assays described herein) comprising the steps of (i) obtaining a
pre-administration sample from a subject prior to administration of the agent; (ii) detecting
the level of expression of a NOVX protein, mRNA, or genomic DNA in the
pre-administration sample; (iii) obtaining one or more post-administration samples from the
subject; (iv) detecting the level of expression or activity of the NOVX protein, mRNA, or
genomic DNA in the post-administration samples; (v) comparing the level of expression or
activity of the NOVX protein, mRNA, or genomic DNA in the pre-administration sample
with that of the NOVX protein, mRNA, or genomic DNA in the post-administration sample or
samples; and (vi) altering the administration of the agent to the subject accordingly. For
example, increased administration of the agent may be desirable to increase the expression
or activity of NOVX to higher levels than detected, i.e., to increase the effectiveness of the
agent. Alternatively, decreased administration of the agent may be desirable to decrease
expression or activity of NOVX to lower levels than detected, i.e., to decrease the
effectiveness of the agent.
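
By way of non-limiting illustration, steps (v) and (vi) of the monitoring method may be sketched in Python as follows. The sketch is not part of the disclosed method: the function names, the `target_level` parameter, and the 10% `tolerance` are hypothetical, expression levels are taken to be in arbitrary but consistent units, and the agent is assumed to be one that raises NOVX expression or activity.

```python
from statistics import mean


def fold_change(pre_level, post_levels):
    """Step (v): compare post-administration levels with the pre-administration level.

    Returns the ratio of the mean post-administration level to the
    pre-administration level (hypothetical summary statistic).
    """
    if pre_level <= 0:
        raise ValueError("pre-administration level must be positive")
    if not post_levels:
        raise ValueError("at least one post-administration sample is required")
    return mean(post_levels) / pre_level


def suggest_adjustment(post_levels, target_level, tolerance=0.10):
    """Step (vi): suggest how administration might be altered.

    Assumes an agent that increases NOVX expression or activity, so a level
    below the (hypothetical) target suggests increased administration and a
    level above it suggests decreased administration.
    """
    if not post_levels:
        raise ValueError("at least one post-administration sample is required")
    observed = mean(post_levels)
    if observed < target_level * (1 - tolerance):
        return "increase"
    if observed > target_level * (1 + tolerance):
        return "decrease"
    return "maintain"
```

For instance, `fold_change(2.0, [3.0, 5.0])` summarizes the change relative to baseline, and `suggest_adjustment([3.0, 5.0], target_level=8.0)` indicates the direction in which administration might be altered. Any actual adjustment would of course rest on clinical judgment rather than on such a rule alone.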

Methods of Treatment 

The invention provides for both prophylactic and therapeutic methods of treating a
subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant
NOVX expression or activity. The disorders include but are not limited to, e.g., those
diseases, disorders and conditions listed above, and more particularly include those
diseases, disorders, or conditions associated with homologs of a NOVX protein, such as
those summarized in Table A.

These methods of treatment will be discussed more fully, below. 

Diseases and Disorders 

Diseases and disorders that are characterized by increased (relative to a subject not
suffering from the disease or disorder) levels or biological activity may be treated with
Therapeutics that antagonize (i.e., reduce or inhibit) activity. Therapeutics that antagonize
activity may be administered in a therapeutic or prophylactic manner. Therapeutics that
may be utilized include, but are not limited to: (i) an aforementioned peptide, or analogs,
derivatives, fragments or homologs thereof; (ii) antibodies to an aforementioned peptide;
(iii) nucleic acids encoding an aforementioned peptide; (iv) administration of antisense
nucleic acid and nucleic acids that are "dysfunctional" (i.e., due to a heterologous insertion
within the coding sequence of an aforementioned peptide) that are
utilized to "knockout" endogenous function of an aforementioned peptide by homologous
recombination (see, e.g., Capecchi, 1989. Science 244: 1288-1292); or (v) modulators (i.e.,
inhibitors, agonists and antagonists, including additional peptide mimetics of the invention
or antibodies specific to a peptide of the invention) that alter the interaction between an
aforementioned peptide and its binding partner.

Diseases and disorders that are characterized by decreased (relative to a subject not
suffering from the disease or disorder) levels or biological activity may be treated with
Therapeutics that increase (i.e., are agonists to) activity. Therapeutics that upregulate
activity may be administered in a therapeutic or prophylactic manner. Therapeutics that
may be utilized include, but are not limited to, an aforementioned peptide, or analogs,
derivatives, fragments or homologs thereof; or an agonist that increases bioavailability.
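
The two cases above amount to a simple decision rule, which may be sketched in Python as follows, by way of illustration only. The 1.5-fold `threshold`, the function name, and the mapping are hypothetical and are not taken from the specification; levels are assumed to be measured in the same units for the patient sample and for a subject not suffering from the disease or disorder.

```python
def classify_level(patient_level, reference_level, threshold=1.5):
    """Classify a patient's level relative to an unaffected reference subject.

    `threshold` is a hypothetical fold-change cutoff: ratios at or above it
    count as "increased", ratios at or below its reciprocal as "decreased".
    """
    if patient_level < 0 or reference_level <= 0:
        raise ValueError("levels must be non-negative and the reference positive")
    ratio = patient_level / reference_level
    if ratio >= threshold:
        return "increased"
    if ratio <= 1 / threshold:
        return "decreased"
    return "unchanged"


# Class of Therapeutic suggested by the text for each case: antagonists for
# increased levels or activity, agonists for decreased levels or activity.
THERAPEUTIC_CLASS = {
    "increased": "antagonist",
    "decreased": "agonist",
    "unchanged": None,
}
```

A call such as `THERAPEUTIC_CLASS[classify_level(3.0, 1.0)]` selects the antagonist class, mirroring the first case; the cutoff itself would need to be established for each assay.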

Increased or decreased levels can be readily detected by quantifying peptide and/or
RNA, by obtaining a patient tissue sample (e.g., from biopsy tissue) and assaying it in vitro
for RNA or peptide levels, structure and/or activity of the expressed peptides (or mRNAs
of an aforementioned peptide). Methods that are well-known within the art include, but are
not limited to, immunoassays (e.g., by Western blot analysis, immunoprecipitation
followed by sodium dodecyl sulfate (SDS) polyacrylamide gel electrophoresis,
immunocytochemistry, etc.) and/or hybridization assays to detect expression of mRNAs
(e.g., Northern assays, dot blots, in situ hybridization, and the like).

Prophylactic Methods

In one aspect, the invention provides a method for preventing, in a subject, a disease
or condition associated with an aberrant NOVX expression or activity, by administering to
the subject an agent that modulates NOVX expression or at least one NOVX activity.
Subjects at risk for a disease that is caused or contributed to by aberrant NOVX expression
or activity can be identified by, for example, any or a combination of diagnostic or
prognostic assays as described herein. Administration of a prophylactic agent can occur
prior to the manifestation of symptoms characteristic of the NOVX aberrancy, such that a
disease or disorder is prevented or, alternatively, delayed in its progression. Depending
upon the type of NOVX aberrancy, for example, a NOVX agonist or NOVX antagonist
agent can be used for treating the subject. The appropriate agent can be determined based
on screening assays described herein. The prophylactic methods of the invention are
further discussed in the following subsections.

Therapeutic Methods 

Another aspect of the invention pertains to methods of modulating NOVX
expression or activity for therapeutic purposes. The modulatory method of the invention
involves contacting a cell with an agent that modulates one or more of the activities of
NOVX protein associated with the cell. An agent that modulates NOVX protein
activity can be an agent as described herein, such as a nucleic acid or a protein, a
naturally-occurring cognate ligand of a NOVX protein, a peptide, a NOVX
peptidomimetic, or other small molecule. In one embodiment, the agent stimulates one or
more NOVX protein activities. Examples of such stimulatory agents include active NOVX
protein and a nucleic acid molecule encoding NOVX that has been introduced into the cell.
In another embodiment, the agent inhibits one or more NOVX protein activities. Examples
of such inhibitory agents include antisense NOVX nucleic acid molecules and anti-NOVX
antibodies. These modulatory methods can be performed in vitro (e.g., by culturing the cell
with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As
such, the invention provides methods of treating an individual afflicted with a disease or
disorder characterized by aberrant expression or activity of a NOVX protein or nucleic acid
molecule. In one embodiment, the method involves administering an agent (e.g., an agent
identified by a screening assay described herein), or combination of agents that modulates
(e.g., up-regulates or down-regulates) NOVX expression or activity. In another
embodiment, the method involves administering a NOVX protein or nucleic acid molecule
as therapy to compensate for reduced or aberrant NOVX expression or activity.

Stimulation of NOVX activity is desirable in situations in which NOVX is
abnormally downregulated and/or in which increased NOVX activity is likely to have a
beneficial effect. One example of such a situation is where a subject has a disorder
characterized by aberrant cell proliferation and/or differentiation (e.g., cancer or immune
associated disorders). Another example of such a situation is where the subject has a
gestational disease (e.g., preeclampsia).

Determination of the Biological Effect of the Therapeutic 

In various embodiments of the invention, suitable in vitro or in vivo assays are
performed to determine the effect of a specific Therapeutic and whether its administration
is indicated for treatment of the affected tissue.

In various specific embodiments, in vitro assays may be performed with
representative cells of the type(s) involved in the patient's disorder, to determine if a given
Therapeutic exerts the desired effect upon the cell type(s). Compounds for use in therapy
may be tested in suitable animal model systems including, but not limited to, rats, mice,
chickens, cows, monkeys, rabbits, and the like, prior to testing in human subjects. Similarly,
for in vivo testing, any of the animal model systems known in the art may be used prior to
administration to human subjects.

Prophylactic and Therapeutic Uses of the Compositions of the Invention

The NOVX nucleic acids and proteins of the invention are useful in potential
prophylactic and therapeutic applications implicated in a variety of disorders. The
disorders include but are not limited to, e.g., those diseases, disorders and conditions listed
above, and more particularly include those diseases, disorders, or conditions associated
with homologs of a NOVX protein, such as those summarized in Table A.

As an example, a cDNA encoding the NOVX protein of the invention may be
useful in gene therapy, and the protein may be useful when administered to a subject in
need thereof. By way of non-limiting example, the compositions of the invention will have
efficacy for treatment of patients suffering from diseases, disorders, conditions and the like,
including but not limited to those listed herein.

Both the novel nucleic acid encoding the NOVX protein, and the NOVX protein of
the invention, or fragments thereof, may also be useful in diagnostic applications, wherein
the presence or amount of the nucleic acid or the protein is to be assessed. A further use
could be as an anti-bacterial molecule (i.e., some peptides have been found to possess
anti-bacterial properties). These materials are further useful in the generation of antibodies,
which immunospecifically bind to the novel substances of the invention for use in
therapeutic or diagnostic methods.

The invention will be further described in the following examples, which do not 
limit the scope of the invention described in the claims. 

EXAMPLES 

Example A: Polynucleotide and Polypeptide Sequences, and Homology Data 
Example 1. 

The NOV1 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 1A.
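
The nucleotide lengths, ORF positions, and polypeptide lengths reported in Table 1A can be cross-checked against one another, since each translated codon spans three nucleotides. The following Python sketch illustrates such a check; it is offered as an aid to the reader and is not part of the original analysis. The function and dictionary names are hypothetical, positions are assumed to be 1-based, and "ORF Stop: TAA at 6160" is assumed to give the position of the first base of the stop codon.

```python
def expected_protein_length(orf_start, orf_stop=None, seq_length=None):
    """Number of codons translated from a 1-based ORF start position.

    If `orf_stop` is given, it is taken as the 1-based position of the first
    base of the stop codon (an assumption). Otherwise the ORF is taken to run
    to the end of a sequence of `seq_length` nucleotides.
    """
    if orf_stop is not None:
        coding_nt = orf_stop - orf_start
    elif seq_length is not None:
        coding_nt = seq_length - orf_start + 1
    else:
        raise ValueError("either orf_stop or seq_length is required")
    if coding_nt <= 0 or coding_nt % 3 != 0:
        raise ValueError("coding region is not a positive multiple of three")
    return coding_nt // 3


# Values as reported for the NOV1 clones in Table 1A.
TABLE_1A = {
    "NOV1a": {"orf_start": 1, "orf_stop": 6160, "seq_length": 6189, "reported_aa": 2053},
    "NOV1b": {"orf_start": 2, "orf_stop": None, "seq_length": 1870, "reported_aa": 623},
    "NOV1c": {"orf_start": 2, "orf_stop": None, "seq_length": 2497, "reported_aa": 832},
    "NOV1d": {"orf_start": 2, "orf_stop": None, "seq_length": 2542, "reported_aa": 847},
    "NOV1e": {"orf_start": 2, "orf_stop": None, "seq_length": 1870, "reported_aa": 623},
    "NOV1f": {"orf_start": 2, "orf_stop": None, "seq_length": 1915, "reported_aa": 638},
}


def check_table(table):
    """Return, per clone, whether the reported polypeptide length matches."""
    return {
        name: expected_protein_length(
            entry["orf_start"], entry["orf_stop"], entry["seq_length"]
        ) == entry["reported_aa"]
        for name, entry in table.items()
    }
```

Calling `check_table(TABLE_1A)` returns a per-clone flag; a mismatch would point to a transcription error in a length or an ORF position rather than to anything about the underlying biology.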



Table 1A. NOV1 Sequence Analysis




SEQ ID NO: 1 | 6189 bp

NOV1a, CG106764-01 DNA Sequence


ATGTTGAAGTTCAAATATGGAGCGCGGAATCC TTTGGATGC TGGTGC TGCTGAACCCATTGCCAGCCG 
GGCCTCC^GGCTGAATCTCTTCTTCCAGGGGAAACCACCCTTTATGACTCAACAGCAGATGTCTCCTC 
TTTCCCGAGAAGGGATATOAGATGCCCTCTTTG 

AAGATTAAGCACGTGAGCAACTTTGTCCGGAAGTGTTCCGACACCATAGCTGAGTTACAGGAGCTCCA 
GCCTTCGGGA^GGACTTCGAkGTCAGAAGTCTTGTAGGTTGTGG 

TAAGAGAGAAAGCAACCGGGGACATCTATGCTATGAAAGTGATGAAGAAGAAGGCTTTATTGGCCCAG 

GAGCAGGTTTCATTTTTTGAGGAAGAGCGGAACATATTATCTCGAAGCACAAGCCCGTGGATCCCCCA 

ATTACAGTATGCCTTTC^GGAO^AAAATCACCTTTATCTGGTGATGGAATATCAGCCTGGAGGGGACT 

TGCTGTCACTTTTGAATAGATATGAGGACCAGTTAGATGAAAACCTGATACAGTTTTACCTAGCTGAG 

CTGATTTTGGCTGTTCACAGCGTTCATCTGATGGGATACGTGCATCGGGACATCAAGCCTGAGAACAT 

TCTCGTTGACCGCACAGGACACATCAAGCTGGTGGATTTTGGATCTGCCGCGAAAATGAATTCAAACA 

AGGTGAATGCCAAACTCCCGATTGGGACCCCAGATTACATGGCTCCTGAAGTGCTGACTGTGATGAAC 

GGGGATGGAAAAGGCACCTACGGCCTGGACTGTGACTGGTGGTCAGTGGGCGTGATTGCCTATGAGAT 

G ATTT ATGGGAG ATC CCC C T TCGCAG AG GG AACC TC TG CC AGAAC C TTC AAT AACATTATGAATTTC C 

AGCGGTTTTTGAAATTTCCAGATGACCCCAAAGTGAGCAGTGACTTTCTTGATCTGATTCAAAGCTTG 

TTGTGCGGCC^GAAAGAGAGACTGAAGTTTGAAGGTCTTTGCTGCCATCCTTTCTTCTCTAAAATTGA 

CTGGAACAACATTCGTAACGCTCC TCCCCCC TTCGTTCCCACCCTCAAGTC TGACGATGAC ACCTCC A 

ATTTTGATGAACCAGAGAAGAATTCGTGGGTTTCATCCTCTCCGTGCCAGCTGAGCCCCTCAGGCTTC 

TCGGGTGAAGAACTGCCGTTTGTGGGGTTTTCGTACAGCAAGGCACTGGGGATTCTTGGTAGATCTGA 

GTCTGTTGTGTCGGGTCTGGACTCCCCTGCCAAGACTAGCTCCATGGAAAAGAAACTTCTCATCAAAA 

GCAAAGAGCTACAAGACTCTCAGGACAAGTGTCACAAGATGGAGCAGGAAATGACCCGGTTACATCGG 

AGAGTGTCAGAGGTGGAGGCTGTGCTTAGTCAGAAGGAGGTGGAGCTGAAGGCCTCTGAGACTCAGAG 

ATCCCTCCTGGAGCAGGACCTTGCTACCTACATCACAGAATGCAGTAGCTTAAAGCGAAGTTTGGAGC 

AAGCACGGATGGAGGTGTCCCAGGAGGATGACAAAGCACTGCAGCTTCTCCATGATATCAGAGAGCAG 

AGCCGGAAGCTCCAAGAAATCAAAGAGCAGGAGTACCAGGCTCAAGTGGAAGAAATGAGGTTGATGAT 

GAATCAGOTGGAAGAGGATCTTGTCTCAGCAAGAAGACGGAGTGATCTCTACGAATCTGAGCTGAGAG 

AGTCTCGGCTTGCTGCTGAAGAATTCAAGCGGAAAGCGACAGAATGTCA.GCATAAACTGTTGAAGGCT 

AAGGATCAGGGGAAGCCTGAAGTGGGAGAATATGCGAAACTGGAGAAGATCAATGCTGAGCAGCAGCT 

CAAAATTCAGGAGCTCCAAGAGAAAC TGGAGAAGGC TGTAAAAGCCAGCACGGAGGCCACCGAGCTGC 

TGCAGAATATCCGCCAGGCAAAGGAGCGAGCCGAGAGGGAGCTGGAGAAGCTGCAGAACCGAGAGGAT 

TCTTCTGAAGGCATCAGAAAGAAGCTGGTGGAAGCTGAG(3AACGCCGCCATTCTCTGGAGAACAAGGT 

AAAGAGACTAGAGACCATGGAGCGTAGAGAAAAOIGACTGAAGGATGACATCCAGACAAAATCCCAAC 

AGATCCAGCAGATGGCTGATAAAATTCTGGAGCTCGAAGAGAAACATCGGGAGGCCCAAGTCTCAGCC 

CAGCACCTAGAAGTGCACCTGAAACAGAAAGAGCAGCACTATGAGGAAAAGATTAAAGTATTGGACAA 

TCAGATAAAGAAAGACCTGGCTGACAAGGAGACACTGGAGAACATGATGCAGAGACACGAGGAGGAGG 

CCCATGAGAAGGG(^AAATTCTCAGCGAACAGAAGGCGATGATCAATGCTATGGATTCCAAGATCAGA 

TCCCTGG^ACAGAGGATTGTGGAACTGTCTGAAGCCAATAAACTTGCAGCAAATAGCAGTC 

CGAAAGGAACATGAAGGCCC^GAAGAGATGATTTCTGAACTCAGGCAACAGAAATTTTACCTGGAGA
CACAGGCTGGGAAGTTGGAGGCCCAGAACCGA

GACCACAGTG AC AAGAATC GG C TGC TGG AACTGG AGAC AAGAT TG CGGG AGGTGAGTC TAG AG CACGA 
GGAGCAGAAACTGGAGCTGAAGCGCCAGCTCACAGAGCTACAGCTCTCC^ 

AGTTGACAGCCCTGCAGGCTGCACGGGCGGCCCTGGAGAGCCAGCTTCGCCAGGCGAAGACAGAGCTG 

GAAGAGAC CAC AGCAGAAG CTGAAGAGG AG ATC CAGGC AC TCACG GC AC AT AGAGATGAAATC CAGCG 

C AAATTTGATG CTC T TCGTAACAGC TGTAC TGTGATC AC AGAC CTGGAGGAGCAGC TAAACC AGC TGA 

CCGAGGACAACGCTGAACTCAACAACCAAAACTTCTACTTGTCCAAACAACTCGATGAGGCTTCTGGC 

GCC^CGACGAGATTGTACAACTGCGAAGTGAAGTGGACCATCTCCGCCGGGAGATCACGGAACGAGA 

GATGCAGCTTACCAGCCAGAAGCAAACGATGGAGGCTCTGAAGACCACGTGCACCATGCTGGAGGAAC 

AGGTCATGGATTTGGAGGCCCTAAACGATGAGCTGCTAGAAAAAGAGCGGCAGTGGGAGGCCTGGAGG 

AGCGTCCTGGGTGATGAGAAATCCCAGTTTGAGTGTCGGGTTCGAGAGCTGCAGAGGATGCTGGACAC 

CGAGAAACAGAGCAGGGCGAGAGCCGATCAGCGGATCACCGAGTCTCGCCAGGTGGTGGAGCTGGCAG 

TGAAGGAGCAC^GGCTGAGATTCTCGCTCTGCAGCAGGCTCTCAAAGAGCAGAAGCTGAAGGCCGAG 

AGCCTCTCTGACAAGCTCAATGACCTGGAGAAGAAGCATGCTATGCTTGAAATGAATGCCCGAAG 

ACAGCAGAAGCTGGAGACTGAACGAGAGCTCAAACAGAGGCTTCTGGAAGAGCAAGCCAAATTACAGC 

AGCAGATGGACCTGCAGAAAAATCACATTTTCCGTCTGACTCAAGGACTGCAAGAAGCTCTAGATCG^ 

GCTGATCTACTGAAGAC^GAAAGAAGTGACTTGGAGTATCAGCTGGAAAACATTCAGGTGCTCTATTC 

TCATGAAAAGGTGAAAATGGAAGGCACTATTTCTCAACAAACCAAACTCATTGATTTTCTGCAAGCCA 

AAATGGACCAACCTGCTAAAAAGAAAAAGGTGCCTCTGCAGTACAATGAGCTGAAGCTGGCCCTGGAG 

AAGGAGAAAGCTCGCTGTGCAGAGCTAGAGGAAGCCCTTCAGAAGACCCGCATCGAGCTCCGGTCCGC 

C CGGG AGGAAGCTGCC CAC CGCAAAGCAACGGACCACCCACACCC ATC CACGCCAGCCACCGCGAGGC 

AGCAGATCGCCATGTCTGCCATCGTGCGGTCGCCAGAGCACCAGCCCAGTGCCATC 

CCGCCATCCAGCCGCAGAAAGGAGTCTTCAACTCCAGAGGAATTTAGTCGGCGTCTTAAGGAACGCAT 

GCACCAGAATATTCCTCACCGATTCAACGTAGGACTGAACATGCGAGCC^ 

TGGATACCGTGCACTTTGGACGCCAGGCATCCAAATGTCTAGAATGTCAGGTGATGTGTCACCCCAAG 
TGCTCCACGTGCTTOTCAGCGACCTGCGGCT 

CTGCCGTGACAAAATGAACTCCCCAGGTCTCCAGACCAAGGAGCCCAGCA 

GGTGGATG AAGG TGC CC AGGAATAACAAACGAGGACAGGAAGGC TGGG ACA TGTC C TG 
GAGGGATCAAAAGTCCTCATTTATGACAATGAAGCCAGAGAAGCTGGACAGAGGCCGGTGGAAGAATT 
TGAGCTGTGCCTTCCCGACGGGGATGTATCTATTCATGGTGCCGTTGGTGCTTCCGAACTCGCAAATA 
CAGCGAAAGCAGATGTCCCATACATACTGAAGATGGAATCT 

AGAACCCTCTACTTGCTAGCTCCGAGCTTCCCTGACAAACAGCGCTGGGTCACCGCCTO 
TGTCGCAGGTGGGAGAGTTTCTAGGGAAAAAGCAGAAGCTGATGCTAAACTGCTTGGAAACTCCCTC 
TGAAACTGGAAGGTGATGACCGTCTAGACATGAACTGCACGCTGCCC TTCAGTGAC CAGGTAGTGTTG 
GTGGGCACCGAGGAAGGGCTC TACGCCC TGAATGTCTTGAAAAACTCCCTAACCGATGTCCCAGGAAT 
TGGAGCAGTCTTCCAAATTTATATTATCAAGGACCTGGAGAAGCTACTCATGATAGCAGGTGAAGAGC 
GGGCACTGTGTCTTGTGGACGTGAAGAAAGTGAAACAGTCCCTGGCCCAGTC 
CCCGACATCTCACCGAACATTTTTGAAGCTGTCAAGGGCTGCCACOT 
GAACGGGCTCTGCATCTGTGCAGCCATGCCCAGCAAAGTC 

GCAAATACTGCATCCGGAAAGAGATAGAGACCTCAGAGCCCTGCAGCTGTATCCACTTCACCAAT 

AGTATCCTCATTGGAACCAATAAATTCTACGAAATCGACATGAAGCAGTACACGCTCGAGGAATTCCT 

GGATAAGAATGACCATTCCTTGGCACCTGCTGTGTTTGCCGCCTCTTCCAACAGCTTCCCTG 

TCGTGCAGGTGAACAGCGCAGGGCAGCGAGAGGAGTACTTGCTGTGTTTCCACGAATTTGGAGTGTTC 

G TGGATTC TTACGGAAGACGTAGC CGC AC AGACGATC TC AAGTGGAGTC GC T TACC TTTGGCCTTTGC 

CTACAGAGAACCCTATCTGTTTGTGACO^ 

CCTCAGCAGGGACCCCTGCCCGAGCGTACCTGGACATCCCGAACCCGCGCTACCTGGGCCCTGCCATT 
TCCTCAGGAGCGATTTACTTGGCGTCCTCATACCAGGATAAATTAAGGGTCATTTGCTGCAAGGGAAA 
CCTCGTGAAGGAGTCCGGCACTGAACACCACCGGGGCCCGTCCACCTCCCGCAGCAGCCCCAACAAGC 
GAGGCCCACCCACGTACAACGAGCACATCACCAAGCGCGTGGCCTCCAGCCCAGCGCCGCCCGAAGGC 
CCCAGCCACCCGCGAGAGCCAAGCACACCCCACCGCTACCGCGAGGGGCGGACCGAGCTGCGCAGGGA 
CAAGTCTCCTGGCCGCCCCCTGGAGCGAGAGAAGTCCCCCGGCCGGATGCTCAGCACGCGGAGAGAGC 
GGTCCCCCGGGAGGCTGTTTGAAGACAGCAGCAGGGGCCGGCTGCCTGCGGGAGCCGTGAGGACCCCG 
CTGTCCCAGGTGAACAAGGTGTGGGACCAGTCTTCAGTATAAATCTCAGCCAGAAAAACCAACTCCTC 
A 






ORF Start: ATG at 1 | ORF Stop: TAA at 6160







SEQ ID NO: 2 | 2053 aa | MW at 234700.1kD

NOV1a, CG106764-01 Protein Sequence


MLKFKYGAKNPLDAGAAEPIASRASRLl^ 

KIKHVSNFVRKCSDTIAELQELQPSAKI)FEVRSLVGCGHFACTQ 

EQVSFFEEERNILSRSTSPWIPQLQYAFQDKNHLYLVMEYQPGGDL^^ 

LILAVHSVHLMGYVHRDIKPENILVDRTGHIKLTO 

GIX3KGTYGLDCDWWSVGVIAYEMIYGRSPFAEGTSARTFm^ 

LCGQKERLKFEGLCCHPFFSKIDWNNIRNAPPPFVPTLKSDDDTSNFDEPEKNSWVSSSPCQLSPSGF 

SGEELPFVGFSYSKALGILGRSESWSGLDSPAKTSSMEKKLLIKSKEIiQDSQD^ 

RVSEVEAVLSQKEVELKASETQRSLLEQDLATYITECSSLK^ 

SRKLQEIKEQEYQAQVEEMRLMMNQLEEDLVSARRRSDLYESEI^ 

KDQGKPEVGEYAKLEKINAEQQLKIQELQEKLEKAV^ 

SSEGIRKKLVEAEERRHSLENKVKRLETMERRENRLKDDIO 






qhlevi^^ 

sleqrivelseanklaansslftqrnmkaqeemiselrqqkfyle^^^ 

dhsdknrlleletriirevsleheeqklelkrqltelq 

eettaeaeeeiqaltahrdeiqrkfdalrnscotito^ 

ANDEIVQLRSEVDHIjRREITEREMQLTSQKQTMEAIjKTTCTMLEEQVMD 

SVIXSDEKSQFEOTYRELQKl^OTEKQSI^RADQRITE 

SLSDKLKTOLEKKHAmEMNARSL 

ADLLKTERSDLEYQLENIQVLYSHEKVKMEGTISQQT^ 
KEKARCAELEEALQKTRII^RSAREEAAHRKATDHPH^ 

PPS SRRKES STPEEFSRRLKERMHHNI PHRFOTGLNMRATKCAVCLDT VHFGRQASKCLECQVMCHPK 
C S TCL P ATCGLPAEYATHFTEAFCRDKMNS PGLQTKEPS S SLHLEGWMKVPRNNKRGQQGWDRKYIVL 
EGSKVTiIYDNEAREAGQRPVEEFELCLPTODVSIHGAVGASELAOTA 
RTLYLLAPSFPDKQRWVTALESVYAGGRVSREKAEADAKLL^ 

VGTEEGLYALNVLKNSLTHVPGIGAWQIYIIKDLEKLI^IAGEERALCLVDVKKWQSLAQSHLPAQ 
PDI SPNIFEAVKGCHLFGAGKIENGLC ICAAMPSKWILRYNENLSKYC IRKEIETSEPCSCIHFTNY 
SILIGTNKFYEIDMKQYTXiEEFLDKOTHSriAPAWAASSNSFPVSIVQVNSAGQREEYIiCFHEFGW 
VDSYGRRSRTDDLKWSRLPLAFAYREPYLFVTHFNSLEVIEIQARS SAGTPARAYLDI PNPRYLGPAI 
SS GAI YLAS SYQDKLRVICCKGNLVKESGTEHHRGPS TSRS S PNKRGP P T YNEH I TKRVASS PAPPEG 
PSHPREPSTPHRYREGRTELRRDKSPGRPLiSREKSPGRMLSTRRERSPGRLFEDSSRGRLPAGAVRTP 
LSQVNKVWDQSSV 





SEQ ID NO: 3 | 1870 bp

NOV1b, 268667493 DNA Sequence


CACCGGTACCACCATGTTGAAGTTCAAATATGGAGCGCGGAATCCTTTGGATGCTGGTGCTGCTGAA 
CCCATTGCCAGCCGGGCCTCCAGGCTGAATCTGTTCTTCCAGGGGAAACCACCCTTTATGACTCAAC 
AGGAGATGTCTCCTCTTTCCCGAGAAGGGATATTAGATGCCCTC 

TCAGCCTGCTCTGATGAAGATTAAGCACGTGAGCAACTTTGTCCGGAAGTATTCCGACACCATAGCT 
GAGTTACAGGAGCTCCAGCCTTCGGCAAAGGACTTCGAAGTC^^ 

TTGC TGAAGTGCAGG TGGTAAG AG AGAAAG C AACCGGGG AC ATC T ATGC TATGAAAGTGATGAAGAA 
GAAGGCTTTATTGGCCCAGGAGCAGGTTTCATTTTTTGAGGAAGAGCGGAACATATTATCTCGAAGC 
ACAAGCCCGTGGATCCCCCAATTACAGTATCCCTTTCAGGACAAAAATCACCTTTATCTGGTCATGG 
AATATCAGCCTGGAGGGGACTTGCTGTCACTTTTGAATAGATATGAGGACCAGTTAGATGAAAACCT 
GATAC^GTTTTACCTAGCTGAGCTGATTTTGGCTGTTCACAGCGTTCATCTGATGGGATACGTGCAT 
CG AGACATC AAGCCTGAGAACATTC TC GT TGACCGC ACAGGAC AC ATC AAGC TGGT GGATTTT GGAT 
CTGCCGCGAAAATGAATTCAAACAAGATGGTGAATGCCAAACTCCCGATTGGGACCCCAGATTACAT 
GGCTCCTGAAGTGCTGACTGTGATGAACGGGGATGGAAAAGGCACCTACGGCCTGGACTGTGACTGG 
TGGTCAGTGGGCGTGATOGCCTATGAGATGATTTATGGGAGATCCCCCTTCGCAGAGGGAACCTCTG 
CCAGAACCTTCAATAACATTATGAATTTCCAGCGGTTTTTGAAATTTC 

C AGTGAC TTTCTTGATC TG AT TCAAAGC TTG TTG TGC GGC CAG AAAGAGAGAC TGAAGTT TGAAGGT 
CTTTGCTGCCATCCTTTCTTCTCTAAAATTGACTGGAACAACATTCGTAACTCTCCTCCCCCCTTCG 
TTCCCACCCTCAAGTCTGACGATGACACCTCCAATTTTGATGAACCAGAGAAGAATTCGTGGGTTTC 
ATCCTCTCCGTGCCAGCTGAGCCCCTCAGGCTTCTCGK^TGAAGAACTGCCGTTTGTGGGGTTTTC^ 
TACAGCAAGGC^CTGGGGATTCTTGGTAGATCTGAGTCTGTTGTGTCGGGTCTGGACTCCCCTGCCA 
AGACTAGCTCCATGGAAAAGAAACTTCTCATCAAAAGCAAAGAGCTACAAGACTCTCAGGACAAGTG 
TCACAAGATGGAGCAGGAAATGACCCGGTTACATCGGAGAGTGTCAGAGGTGGAGGCTGTGCTTAGT 
CAGAAGGAGGTGGAGCTGAAGGCCTCTGAGACTCAGAGATCCCTCCTGGAGCAGGACCTTGCTACCT 
ACATCACAGAATGCAGTAGCTTAAAGCGAAGTTTGGAGCAAGCACGGATGGAGGTGTCCCAGGAGGA 
TGACAAAGC ACTGCAGCTTCTCCATGATATC AGAGAGCAGAGC CGGAAGC TCCAAGAAATCAAAGAG 
CAGGAGTACCAGGCTCAAGTGGAAGAAATGAGGTTGATGATGAATCAGTTGGAAGAGGATCTTGTCT 
CAGCAAGAAGACGGAGTGATCTCTACGAATCTGAGCTGAGAGAGTCTCGGCTTGCTGCTGAAGAATT 
C AAGCGGAAAGCG ACAGAATGTCAGCATAAAC TGTTGAAGGC TAAGG ATC AGGTCGACGGC 




ORF Start: at 2 | ORF Stop: end of sequence





SEQ ID NO: 4 | 623 aa | MW at 70970.0kD

NOV1b, 268667493 Protein Sequence


TGTTMLKFKYGARNPLDAGAAEPIASRASRI^FFQGKPPFMTQQQMSPLSREGILDALFVLFEECS 
QPALMKIKHVSNFVRKYSDTIAELQELQPSAKD^ 

ICALLAQEQVSFFEEERNILSRSTSPWIPQLQYAFQDKNHLYLVMEYQPGGDLLSLLNRYEDQLDE 
IQFYLAELILAVHSVHLMGYVHRDIK^^ 

APEVLTVMNGDGKGTYGLDCDIWSVGVIAYEMIYGRSPFAEGTSARTFNNIMOTQRFLKFPDDPKVS 
SDFLDLIQSLLCGOKERLKFEGLCCHPFFSKIDWNNIRNSPPPFVPTLKSDDDTSNFDEPEKNSWVS 
SSPCQLSPSGFSGEELPFVGFSYSKALGILGRSESWSGLDSPAKTSSMEKKLLIKSKELQDSQDKC 
HKMEOEMTRLHRRVSEVEAVLSOKEVELKASETORSLLEODLAT YI TEC S SLKRSLEOARMEVSOED 






DKALQLLHDIREQSRKLQEIKEQETtfQAQVEE 

KRKATECQHKLLKAKDQVDG 





SEQ ID NO: 5 | 2497 bp

NOV1c, 268667539 DNA Sequence


CACCGGTACCCAGGGGAAGCCTGAAGTGGGAGAATATGCGAAACTGGAGAAGATCAATGCTGAGCAGC 
AGCTCAAAATTCAGGAGCTCCAAGAGAAACTGGAGAAGGCTGTAAAAGCCAGCACGGAGGCCACCGAG 
CTGCTGCAGAATATCCGCCAGGCAAAGGAGCGAGCCGAGAGGGAGCTGGAGAAGCTGCAGAACCGAGA 
GGATTCTTCTGAAGGCATCAGAAAGAAGCTGGTGGAAGCTGAGGAACGCCGCCATTCTCTGGAGAACA 
AGGTAAAGAGACTAGAGACCATGGAGCGTAGAGAAAACAGACTGAAGGATGACATCCAGACAAAATCC 
CAACAGATCCAGCAGATGGCTGATAAAATTCTGGAGCTCGAAGAGAAACATCGGGAGGCCCAAGTCTC 
AGCCCAGCACCTAGAAGTGCACCTGAAACAGAAAGAGCAGCACTATGAGGAAAAGATTAAAGTGTTGG 
ACAATCAGATAAAGAAAGACCTGGCTGACAAGGAGACACTGGAGAACATGATGCAGAGACACGAGGAG 
GAGGCCCATGAGAAGGGCAAAATTCTCAGCGAACAGAAGGCGATGATCAATGCTATGGATTCCAAGAT 
CAGATCCCTGGAACAGAGGATTGTGGAACTGTCTGAAGCCAATAAACTTGCAGCAAATAGCAGTCTTT 
TTACCCAAAGGAACATGAAGGCCCAAGAAGAGATGATTTCTGAACTCAGGCAACAGAAATTTTACCTG 
GAGACTCAGGCTGGGAAGTTGGAGGCCCAGAACCGAAAACTGGAGGAGCAGCTGGAGAAGATCAGCCA 
C CAAGAC CAC AGTG AC AAG AATC GGC TGC TGGAAC TGGAG ACAAG ATTGC GG G AGGTC AG TC TAGAGC 
ACGAGGAGCAGAAACTGGAGCTCAAGCGCCAGCTCACAGAGCTACAGCTCTCCCTGCAGGAGCGCGAG 
TCACAGTTGACAGCC C TGC AGGCTGCACGGGCGGCCC TGGAGAGCCAGCTTCGCCAGGCGAAGACAGA 
GCTGGAAGAGACCACAGCAGAAGCTGAAGAGGAGATCCAGGCACTCACGGCACATAGAGATGAAATCC 
AGCGCAAATTTGATGCTCTTCGTAACAGCTGTACTGTAATCACAGACCTGGAGGAGCAGCTAAACCAG 

L. lXxAL.C0At3QjAL~AAL.v3L. IajAAL I UWLAALLAAAAu 1 i.L 1 1\t AvLAAALAAV, LtALtLtL. AIL 

TGGCGCCAACGACGAGATTGTACAACTGCGAAGTGi^GTGGACCATCTCCGCCGGGAGATCACGGAAC 

GAGAGATGCAGCTTACCAGCCAGAAGCAAACGATGGAGGCTCTGAAGACCACGTGra 

GAACAGGTCATGGATTTGGAGGCCCTAAACGATGAGCTGCTAGAAAAAGAGCGGCAGTGGGAGGCCTG 

GAGGAGCGTCCTGGGTGATGAGAAATCCCAGTTTGAGTGTCGGGTTCGAGAGCTGCAGAGAATGCTGG 

ACACCGAGAAACAGAGCAGGGCGAGAGCCGATCAGCGGATCACCGAGTC TCGC CAGGTGGTGGAGCTG 

GCAGTGAAl^AGGaCAAGGCTC^GATTCTCGCTC 

C GAG AGCC TC TC TG AC AAGC TC AATGAC CTGGAGAAG AAGCATGC TATGC T TG AAATGAATGC C CGAA 

GCTTACAGCAGAAGCTGGAGACTGAACGAGAGCTCAAACAGAGGCTTCTGGAAGAGCAAGCCAAATTA 

CAGCAGC AGATGG AC C TGCAGAAAAATC AC ATT T TCC GTC TGAC TCAAGG AC TGC AAGAAGCTC TAG A 

TCGGGCTGATCTACTGAAGACAGAAAGAAGTGACTTGGAGTATCAGCTGGAAAACATTCAGGTTCTCT 

ATTC TC ATGAAAAGGTGAAAATGGAAGGCAC TATTTC TC AACAAACC AAAC TC ATTGATTT TC TGC AA 

GCCAAAATGGACCAACCTGCTAAAAAGAAAAAGGTTCCTCTGCAGTACAATGAGCTGAAGCTGGCCCT 

GGAGAAGGAGAAAGCTCGCTGTGCAGAGCTAGAGGAAGCCCTTCAGAAGACCCGCATCGAGCTCCGGT 

CCGCCCGGGAGGAAGCTGCCCACCGCAAAGCAACGGACCACCCACACCCATCCACGCCAGCCACCGCG 

AGGi^GCAGATCGCC^TGTCCGCC^TCGTGCGGTCGCCAGAGCACCAGCCCAGTGCCATGAGCCTGCT 

GGCCCCGCCATCCAGCCGCAGAAAGGAGTCTTGAACTCCAGAGGAATTTAGTCGGCGTCTTAAGGAAC 

GCATGCACCACAATATTCCTCACCGATTCAACGTAGGACTGAAC ATGCGAGC CACAAAGTGTGC TGTG 

TGTCTGGATACCGTGCACTTTGGACGCCAGGCATCCAAATGTCTCGAATGTCAGGTGATC 

GAAGTGCTCCACGTGCTTGCCAGCCACCTGCGGCTTGCCTGTCGACGGC 




ORF Start: at 2 | ORF Stop: end of sequence





SEQ ID NO: 6 | 832 aa | MW at 96885.8kD

NOV1c, 268667539 Protein Sequence


TGTQGKPEVGEYAKLEKINAEQQLKIQELQEKLEKAVKASTEATELLQNIRQAKERAERELEKLQNRE 
DSSEGIRKKLVEAEERRHSLENKVIG^LETMERRENRLKDDIQTKSQQIQQMA^ 

AQHLEVHLKQKEQHYEEKIKVLDNQ IKKDL ADKETLENMMQRHEEEAHEKGK I L S EQKAMINAMDSK I 

RSLEQRIVELSEANKLAANS SLFTQRNMKAQEEMI SELRQQKFYLETQAGKLEAQNRKLEEQLEKI SH 

QDHSDKimiLELETRLREVSLEHEEQKLELKRQLTELQLSLQERESQLTALQAARAALESQLRQ 

LEETTAEAEEEIQALTAHRDEIQRKFDALRNSCWITDLEEQLNQLTEDNAELNNQOT 

GANDEIVQLRSEVDHLRREITEREMQLTSQKQM 

RSVLGDEKSQFECRVRELQRMI^TEKQSRARADQRITESRQVV^ 

ESLSDKLITOLEKKHAMLEMNARSLQ^ 

RADLLKTERSDLEYQLENIQVLYSHEKV^^ 

EKEKARCAELEEALQKTRIELRSAREEAAHRKATDHPHPSTPATARQQIAMSAIVRSPEHQPSAMSLL 

APPSSRRKESSTPEEFSRRLKERMHHNIPHRFOTGLxMRATKCAVCL^ 

KCSTCLPATCGLPVDG 






SEQ ID NO: 7 | 2542 bp

NOV1d, 268667543 DNA Sequence



CAC CGGT ACC C AG GGG AAGC CTG AAG TGGGAG AATAT G CGAAAC TGG AGAAG ATC AATGCTG AG C AG 

CAGC TCAAAATTC AGGAGCTCCAAGAGAAACTGG AGAAGGCTGTAAAAGCC AGC ACGGAGGC CACCG 

AGCTGCTGCAGAATATC CGCCAGGC AAAGGAGCGAGCCGAGAGGGAGCTGGAGAAGC TGCAGAACCG 

AGAGGATTCTTCTGAAGGCATCAGAAAGAAGCTGGTGGAAGCTGAGGAACGCCGCCATTCTCTGGAG 

AACAAGGTAAAGAGACTAGAGACCATGGAGCGTAGAGAAAACAGACTGAAGGATGACATCCAGACAA 

AATCCCAACAGATCCAGCAGATGGCTGATAAAATTCTGGAGCTCGAAGAGAAACATCGGGAGGCCCA 

AGTC TCAG C CCAGCACC TAG AAGTGCAC CTGAAACAG AAAGAGCAGCAC TATG AGG AAAAGATTAAA 

GTGT TGGAC AATCAGATAAAGAAAG AC C TGGC TG ACAAGGAGACACTGGAGAAC ATGATG C AGAG AC 

ACGAGGAGGAGGCCCATGAGAAGGGCAAAATTCTCAGCGAACAGAAGGCGATGATCAATGCTATGGA 

TTCCAAGATCAG ATCC C TGGAACAG AGG ATTGTGG AAC TGTCTG AAGC C AAT AAACT TGC AG CAAAT 

AGCAGTCTTTTTACCCAAAGGAACATGAAGGCCCAAGAAGAGATGATTTCTGAACTCAGGCAACAGA 

AATTTTACCTGGAGACACAGGCTGGGAAGTTGGAGGCCCAGAACCGAAAACTGGAGGAGCAGCTGGA 

GAAGATCAGCGACCAAG ACCAC AGTGAC AAGAATCGGC TGCTGGAAC TGGAGAC AAGATTGCGGGAG 

GTCAGTCTAGAGCACGAGGAGCAGAAACTGGAGCTCAAGCGCCAGCTCACAGAGCTACAGCTCTCCC 

TGCAGGAGCGCGAGTCACAGTTGACAGCCCTGCAGGCTGCACGGGCGGCCCTGGAGAGCCAGCTTCG 

CCAGGCGAAGACAGAGCTGGAAGAGACCACAGCAGAAGCTGAAGAGGAGATCCAGGCACTCACGGCA 

CATAGAGATGAAATCCAGCGCAAATTTGATGCTCTTCGTAACAGCTGTACTGTAATCACAGACCTOT 

AGGAGCAGCTAAACCAGCTGACCGAGGACAACGCTGAACTCAACAACCAAAACTTCTACTTGTCCAA 

ACAACTCGATGAGGCTTCTGGCGCCAACGACGAGATTGTACAACTGCGAAGTGAAGTGGACCATCTC 

CGCCGGGAGATCACGGAACGAGAGATGCAGCTTACCAGCCAGAAGCAAACGATGGAGGCTCTGAAGA 

CCACGTGCACCATGCTGGAGGAACAGGTCATGGATTTGGAGGCCCTAAACGATGAGCTGCTAGAAAA 

AGAGCGGCAGTGGGAGGCCTGGAGGAGCGTCCTGGGTGATGAGAAATCCCAGTTTGAGTGTCGGGTT 

CGAGAGCTGCAGAGGATGCTGGACACCGAGAAACAGAGCAGGGCGAGAGCCGATCAGCGGATCACCG 

AGTCTCGCCAGGTGGTGGAGCTGGCAGTGAAGGAGCACAAGGCTGAGATTCTCGCTCTGCAGCAGGC 

TCTCAAAGAGCAGAAGCTGAAGGCCGAGAGCCTCTCTGACAAGCTCAATGACCTGGAGAAGAAGCAT 

GCTATGCTTGAAATGAATGCCCGAAGCTTACAGCAGAAGCTGGAGACTGAACGAGAGCTCAAACAGA 

GGCTTCTGGAAGAGCAAGCCAAATTACAGCAGCAGATGGACCTGCAGAAAAATCACATTTTCCGTCT 

GACTCAAGGACTGCAAGAAGCTCTAGATCGGGCTGATCTACTGAAGACAGAAAGAAGTGACTTGGAG 

TATC AGC TGGAAAACATTCAGGTTCT CTAT TC TCATGAAAAGGTG AAAATGGAAGGCAC T ATTTCTC 

AACAAACCAAACTGATTGATTTTCTGCAAGCCAAAATGGACCAACCTGCTAAAAAGAAAAAGGGTTT 

ATTTAGTCGACGGAAAGAGGACCCTGCTTTACCCACACAGGTTCCTCTGCAGTACZVATGAGCTC 

CTGGCCCTGGAGAAGGAGAAAGCTCGCTGTGCAGAGCTAGAGGAAGCCCTTCAGAAGACCCGCATCG 

AGCTCCGGTCCGCCCGGGAGGAAGCTGCCCACCGCAAAGCAACGGACCACCCACACCCATCCACGCC 

AGCCACCGCGAGGCAGCAGATCGCCATGTCTGCCATCGTGCGGTCGCCAGAGCACCAGCCCAGTGCC 

ATGAGCCTGCTGGCCCCGCCATCCAGCCGC AGAAAGGAGTC TTCAAC TCCAGAGGAATTTAGTCGGC 

GTCTTAAGGAACGCATGCACCACAATATTCCTCACCGATTCAACGTAGGACTGAACATGCGAGCCAC 

AAAGTGTGCTGTGTGTCTGGATACCGTGCACTTTGGACGCG 

GTGATGTGTCACCCCAAGTGCTCCACGTGCTTGCCAGCCACCTGCGGCTTGCCTGTCGACGGC 

ORF Start: at 2 | ORF Stop: end of sequence



SEQ ID NO: 8 | 847 aa | MW at 98582.7kD

NOV1d, 268667543 Protein Sequence



TGTQGKPEVGEYAKLEKINAEQQLKIQELQEKLEKAVKASTEATELLQNIRQAKERAERELEKLQNR 

EDSSEGIRKKLVEAEERRHSLENKVKRLETMERREN^ 

VSAQHLEVHLKQKEQHYEEKIKVLDNQIKKDI^^ 

SKIRSLEQRIVELSEANKLAANSSLFTQRNMKAQEEMISELRQQKFYLETQA 
KISHQDHSDKNEUIjIiELETRLREVSLEHEEQ 

QAKTELEETTAEAEEEIQALTAHRDEIQRKFDALRNSCOTITDL^ 

QLDEASGANDEIVQLRSEVDHLRREITEREMQLTSQKQTMEALKTTCTmEEQVl^LEALNDELLEK 

ERQWEAWRSVLGDEKSQFECRVRELQRMLDTEKQSRARADQRITESRQVVELAVKEHKAEIL^ 

LKEQKLKAESLSDKLOTLEKKHAMLEMNARS 

TQGLQEALDRM5LLKTERSDLEYQLENIQVLYSHEKVKMEGTISQQTKIiIDFLQAKM 
FSRRKEDPALPTQVPLQYZsTEXKLALEKEKAR 

ATARQQIAMSAITOSPEHQPSAMSMiAPPSSRRKESSTPEEFSRRLKERNffflNIPHRFNVGLNMRAT 
KCAVCLDTVHFGRQASKCLECQVMCHPKCSTCLPATCGLPVDG ' 



SEQ ID NO: 9 | 1870 bp



NOV1e, 268667555 DNA Sequence


CACCOTTACCTGCGGCTTGCCTGCTGAAT^ 

TGAACTCCCCAGGTCTCCAGACCAAGGAGCCCAGCAGCAGCTTGCACCTGGAAGGGTGGATGAAGGTG 
CCCAGGAATAACA7VACGAGGACAGCAAGGCTGGGACAGGAAGTACATTGTCCTGGAGGGATCAAAAGT 
C CTC ATTTATG ACAATGAAGCCAGAGAAGCTGGACAGAGGC CGGTGGAAGAATTTGAGCTGTGCCTTC 
CCGACGGGGATGTATCTATTCATGGTGCCGTTGGTGCTTCCGAACTCGCAAATACAGCCAAAGCAGAT 
GTCCCATACATACTGAAGATGGAATCTC AC CCGCACAC CAC CTGCTGGCCCGGGAGAACCCTC TACTT 
GCTAGCTCCCAGCTTCCCTGACAAACAGCGCTGGGTCACCGCCTTAGAATCAGTTGTCGCAGGTGGGA 
GAGTTTCTAGGGAAAAAGCAGAAGCTGATGCTAAACTGCTTGGAAACTCCCTGCTGAAACTGGAAGGT 
GATGACCGTCTAGACATGAACTGCACGCTGCCCTTCAGTGACCAGGTGGTGTTGGTGGGTACCGAGGA 
AGGGCTCTACGCCC TGAATGTCTTG AAAAAC TCC CTAACCCATGTCCCAGGAATTGGAGCAGTCTTCC 
AAATTTATATTATCAAGGACCTGGAGAAGCTACTCATGATAGCAGGAGAAGAGCGGGCACTGTGTCTT 
GTGGACGTGAAGAAAGTGAAACAGTCCCTGGCCCAGTCC(^CCTGCCTGCCCAGCCCGACATCTCACC 
CAACATTTTTGAAGCTGTCAAGGGCTGCCACTTGTTTC 

TCTGTGCAGCCATGCCCAGCAAAGTCGTCATTCTCCGCTACAACGAAAACCTCAGCAAATACTGCATC 
CGGAAAGAGATAGAGACCTC^GAGCCCTGCAGCTGTATCC^CTT 

AACCAATAAATTCTACGAAATCGACATGAAGCAGTACACGCTCGAGGAATTCCTGGATAAGAATGACC 
ATTCCTTGGCACCTGCTGTGTTTGCCGCCTCTTCCAACAGCTTCCCTGTCTCAATCGTGCAGGTGAAC 
AGCGCAGGGCAGCGAGAGGAGTACTTGCTGTGTTTCCACGAATTTGGAGTGTTCGTGGATTCTTACGG 
AAGACGTAGCCGCAC AG ACGATCT CAAGTGGAGTCGCT TACCTTTGGC CTTTGCC TACAGAGAACCC T 
ATCTGTTTGTGACCCACTTCAACTCACTCGAAGTAATTGA 

CCTGCCCGAGCGTACCTGGACATCCCGAACCCGCGCTACCTGGGCCCTGCCATTTCCTCAGGAGCGAT 
TTACTTGGCGTCCTCATACCAGGATAAATTAAGGGTCATTTGCTGCAAGGGAAACCTCGTGAAGGAGT 
CCGGCACTGAACACCACCGGGGCCCGTCCACCTCCCGCAGCAGCCCGAAG^ 

TACAACGAGCACATCACCAAGCGCGTGGCCTCCAGCCCAGCGCCGCCCGAAGGCCCCAGCCACCCGCG 
AGAGCCAAGCACACCCCACCGCTACCGCGAGGGGCGGACCGAGCTGCGCAGGGACAAGTCTCCTGGCC 
GCCCCCTGGAGCGAGAGAAGTCCCCCGGCCGGATGCTCAGCACGCGGAGAGAGCGGTCCCCCGGGAGG 
CTGTTTGAAGACAGCAGCAGGGGCCGGCTGCCTGCGGGAGCCGTGAGGACCCCGCTGTCCCAGGTGAA 
CAAGGTGTGGGACCAGTCTTCAGTAGTCGACGGC 




ORF Start: at 2 | ORF Stop: end of sequence





SEQ ID NO: 10 | 623 aa | MW at 69278.9kD

NOV1e, 268667555 Protein Sequence


TGTCGJjPAEYATHFTEAFCRDKMNS PGLQTKE PS SSLHLEGWMKVPRNNKRGQQGWDRKYrVIiEGSKV 

LIYDNEAREAGQRPVEEFELCLPDGDVSlHGAVGASEIiANTAKADVPYILKMESHPH 

LAPSF PDKQRWVTALES WAGGRVSREKAEADAKLLGNSLLKLEGDDRLDMNCTL PFSDQWIjVGTEE 

GLYALNVLKNSLTHVPGIGAVFQIYIIKDLEKLLMIAGEERALCLVDVKKVKQ 

NIFEAVKGCHLFGAGKIENGLCICAAMPSKYV^ 

TNKFYEIDMKQYTLEEFIiDKNDHSLAPAVFAAS SNSFPVS IVQVNSAGQREEYLI#CFHEFGVFVDSYG 
RRSRTDDLKWSRLPIAFAYREPYLFVTHFNSLEVIEIQARSSAGTPARAYLDIPNPRYLGPAISSGAI 
YLASS YQDKLRVICCKGNLVKESGTEHHRG PSTSRS S PNKRGPPTYNEHITKRVASS PAPPEGPSHPR 
EPSTPHRYREGRTELRRDKSPGRPLEREK S PGRMLSTRRER S PGRLFEDS SRGRIiPAGAVRT PLSQVN 
KVWDQSSWDG 





SEQ ID NO: 11 | 1915 bp

NOV1f, 268667574 DNA Sequence


CACCGGTACCTGCGGCTTGCCTGCTGAATATGCCACACACTTCACCGAGGCCTTCTGCCGTGATAAAA 
TGAACTCCCCAGGTCTCCAGACCAAGGAGCCCAGCAGCAGCTTGCACCTGGAAGGGTGGATGAAGGTG 
C C C AGG AATAAC AAAC GAGGAC AGC AAGGC TG GGAC AGG AAG T ACAT TGTC C TGG AGGGATCAAAAGT 
CCTCATTTATGACAATGAAGCCAGAGAAGCTGGACAGAGGCCGGTGGAAGAATTTGAGCTGTGCCTTC 
CCGACGGGGATGTATCTATTCATGGTGCCGTTGGTGCTTCCGAACTCGCAAATACAGCCAAAGCAGAT 
GTCCCATACATACTCAAGATGGAATCTCACCCGCAC^^ 

GCTAGCTCCCAGCTTCCCTGACAAACAGCGCTGGGTCACCGCCTTAGAATCAGTTGTCGCAGGTGGGA 
GAGTTTCTAGGGAAAAAGCAGAAGCTGATGCTGCCCGCGACTGTGTTTCTTACGAGCTTCTGCCTGCC 
TGGGTTCAGAAACTGCTTGGAAACTCCCTGCTGAAACTGGAAGGTGATGACCGTCTAGACATGAACTG 
CACACTGCCCTTCAGTGACCAGGTGGTGTTGGTGGGCACCGAGGAAGGGCTCTACGCCCTGAATGTCT 
TCAAAAACTCCCTAACCCATCTCCCAGGAATTGGAGCAGTCTTCCAAATTTATATTATCAAGGACCTG 
GAGAAGCTACTCATGATAGCAGGAGAAGAGCGGGCACTGTGTCTTGTGGACGTGAAGAAAGTGAAACA 
GTCCCTGGCCCAGTCCCACCTGCCTGCCCAGCCCGACATCTCACCCAACATTTTTGAAGCTGTCAAGG 
GCTGCCACTTGTTTGGGGC^GGCAAGATTGAGAACGGGCTCTGCATCTGTGCAGCCATGCCCAGCAAA 
GTCGTCATTCTCCGC TACAACGAAAACC TC AGCAAATAC TGCATCCGGAAAGAGATAGAGACCTCAGA 
GCCCTGCAGCTGTATCCACTTCACCAATTACAGTATCCTCATTGGAACCAATAAATTCTACGAAATCG 
ACATGAAGC AGTACACGC TCGAGGAATTCC TGGATAAGAATGACCATTC CTTGGCACCTGCTGTGTTT 
GCCGCCTCTTCCAACAGCTTCCCTGTCTCAATCGTGCAGGTGAACAGCGCAGGGCAGCGAGAGGAGTA 








CTTGCTGTGTTTCCACGAATTTGGAGTGTO 

TCAAGTGGAGTCGCTTACCTTTGGCCTTTGCCTACAGAGAACCCTATCTGTTTGTGACCCACTTCAAC 
TCAC TCGAAGTAATTGAGATCCAGGCACGCTC CTCAGCAGGGACCCCTGCCCGAGCGTACC TGGACAT 
CCCGAACCCGCGCTACCTGGGCCCTGCCATTTCCTCAGGAGCGATTTACTTGGCGTCCTCATACCAGG 
ATAAATTAAGGGTCATTTGCTGCAAGGGAAACCTCGTGAAGGAGTCCGGCACTGAACACCACCGGGGC 
CCGTCCACCOX:CCGCAGCAGCCC(^C^GCGAGGCCCACCC^CGTACAACGAGCACATCACCAAGCG 
CGTGGCCTCCAGCCCAGCGCCGCCCGAAGGCCCCAGCCACCCGCGAGAGCCAAGCACACCCCACCGCT 
ACCGCGAGGGGCGGACCGAGCTGCGCAGGGACAAGTCTCCTGGCCGCCCCCTGGAGCGAGAGAAGTCC 
CC CGGCCGGATGCTCAGCACGCGGAGAGAGCGGTCCCCCGGGAGGC TGTTTGAAGACAGCAGCAGGGG 
CCGGCTGCCTGCGGGAGCCGTGAGGACCCCGCTGTCCCAGGTGAACAAGGTGTGGGACCAGTCTTCAG 
TAGTCGACGGC 




ORF Start: at 2 | ORF Stop: end of sequence



SEQ ID NO: 12 | 638 aa | MW at 71010.8kD


NOV1f,
268667574 
Protein 
Sequence 


TGTCGLPAEYATHFTEAFCRDKMNSPGLQTKEPSSSLHLEGWMKVPRNN^ 
LIYDNEAREAGQRFVEEFELCLPIXSDVSIHGAVGA 

LAP SF PDKQR WTAL ES WAGGRVSREKAEADAARIX^ VS YELL P AOTQKLLGNS LL KLEGDDRLDMNC 
TLPFSDQVVLVGTEEGLYAIJWLKNSLTHVPGIGAVFQIYII 

SLAQSHLPAQPDISPNIFEAVKGCTLFGAGKIENGLCICAAMPSKWILRYNENLSKYCIRKE 

PCSCIHFTNYSILIGTNKFYEIDMKQYTLEEFLDKNDHSIA 

LLCFHEFGVFVTDSYGRRSRTODLKWSRLPLAFAYREPYLFVTHFNSLEV 

PNPRYLGPAISSGAIYLASSYQDKLRVICCKGNLVKESGTEHHRGPSTSRSSPNKRGPPTYNEHITKR 
VASSPAPPEGPSHPREPSTPHRYREGRTELRRDKS PGRPLEREKS PGRMLSTRRERSPGRLFEDSSRG 
RLPAGAVRTPLSQVNKVWDQSSWDG 





SEQ ID NO: 13 | 6201 bp


NOV1g,
CG106764-02 
DNA Sequence 


ATGTTGAAGTTCAAATATGGAGCGCGGAATCCTTTGGATGCTGGTGCTGCTGAACCCATTGCCAGCC 
GGGCCTCCAGGCTGAATCTGTTCTTCCAGGGGAAACCACCCTTTATGACTCAACAGCAGATGTCTCC 
TCTTTCCCGAGAAGGGATATTAGATGCCCTCTTTGTTCTCTTTGAAGAATGCAGTCAGCCTGCTCTG 
ATGAAGATTAAGCACGTGAGCAACTTTGTCCGGAAGTGTTCCGACACCATAGCTGAGTTACAGGAGC 
TC CAGCC TTCGGCAAAGGAC TTCGAAGTCAGAAGTCTTGTAGGTTGTGGTCACT TTGCTGAAGTGCA 
GGTGGTAAGAGAGAAAGCAACCGGGGACATCTATGCTATGAAAGTGATGAAGAAGAAGGCTTTATTG 
GCCCAGGAGCAGGTTTCATTTTTTGAGGAAGAGCGGAACATATTATCTCGAAGCACAAGCCCGTGGA 
TCCCCCAATTACAGTATGCCTTTCAGGACAAAAATCACCTTTATCTGGTGATGGAATATCAGCCTGG 
AGGGGACTTGCTGTCAC TTT TGAATAGATATGAGGACCAGTTAG ATGAAAACCTGATACAGTTTTAC 
CTAGCTGAGCTGATTTTGGCTGTTCACAGCGTTCATCTGAT^ 

CTGAGAACATTCTCGTTGACCGCACAGGACACATCAAGCTGGTGGATTTTGGATCTGCCGCGAAAAT 
GAATTCAAACAAGGTGAATGCCAAACTCCCGATTGGGACCCCAGATTACATGGCTCCTGAAGTGCTG 
ACTGTGATGAACGGGGATGGAAAAGGCACCTACGGCCTGGACTGTGACTGGTGGTCAGTGGGCGTGA 
TTGCCTATGAGATGATTTATGGGAGATCCCCCTTCGCAGAGGGAACCTCTGCCAGAACCTTCAATAA 
CATTATGAATTTCCAGCGGTTTTTGAAATTTCCAGATGACCCCAAAGTGAGCAGTGACTTTCTTGAT 
CTGATTC AAAGC TTGTTGTGCGGC C AGAAAGAG AGAC TGAAGTTTGAAGGTC TTTGCTGCC ATCC TT 
TCTTCTCTAAAATTGACTGGAACAACATTCGTAACGCTCCTCCCCCCTTCGTTCCCACCCTCAAGTC 
TGACGATGACACCTCCAATT TTGATGAAC CAGAGAAGAATTCGTGGGTTTCATCCTCTCCGTGCCAG 
CTGAGCCCCTCAGGCTTCTCGGGTGAAGAACTGCCGTTTGTGGGGTTTTCGTACAGCAAGGCACTGG 
GGATTCTTGGTAGATCTGAGTCTGTTGTGTCGGGTCTGGACTCCCCTGCCAAGACTAGCTCCATGGA 
AAAGAAACTTCTCATCAAAAGCAAAGAGCTACAAGACTCTCAGGACAAGTGTCACAAGATGGAGCAG 
GAAATGACCCGGTTACATCGGAGAGTGTCAGAGGTGGAGGCTGTGCTTAGTCAGAAGGAGGTGGAGC 
TGAAGGCCTCTGAGACTCAGAGATCCCTCCTGGAGCAGGACCTTGCTACCTACATCACAGAATGCAG 
TAGCTTAAAGCGAAGTTTGGAGCAAGCACGGATGGAGGTGTCCCAGGAGGATGACAAAGCACTGCAG 
CTTCTCCATGATATCAGAGAGCAGAGCCGGAAGCTCCAAGAAATCAAAGAGCAGGAGTACCAGGCTC 
AAGTGGAAGAAATGAGGTTGATGATGAATCAGTTGGAAGAGGATCTTGTCTCAGCAAGAAGACGGAG 
TGATCTCTACGAATCTGAGCTGAGAGAGTCTCGGCTTGCTGCTGAAGAATTCAAGCGGAAAGCGACA 
GAATGTCAGCATAAACTGTTGAAGGCTAAGGATCAGGGGAAGCCTGAAGTGGGAGAATATGCGAAAC 
TGGAGAAGATCAATGCTGAGCAGCAGCTCAAAATTCAGGAGCTCCAAGAGAAACTGGAGAAGGCTGT 
AAAAGCCAGCACGGAGGCCACCGAGCTGCTGCAGAATATCCGCCAGGCAAAGGAGCGAGCCGAGAGG 
GAGCTGGAGAAGCTGCAGAACCGAGAGGATTCTTCTGAAGGCATCAGAAAGAAGCTGGTGGAAGCTG 
AGGAACGCCGCCATTCTCTGGAGAACAAGGTAAAGAGACTAGAGACCATGGAGCGTAGAGAAAACAG 
ACTGAAGGATGACATCCAGACAAAATCCCAACAGATCCAGCAGATGGCTGATAAAATTCTGGAGCTC 
GAAGAGAAACATCGGGAGGCCCAAGTCTCAGCCCAGCACCTAGAAGTGCACCTGAAACAGAAAGAGC 
AGCACTATGAGGAAAAGATTAAAGTATTGGACAATCAGATAAAGAAAGACCTGGCTGACAAGGAGAC 
ACTGGaS^

AAGGCGATGATCAATGCTATGGATTCCAAGATCAGATCCCTGGAACAGAGGATTGTGGAACTGTCTG 

AAGC CAAT AAAC TTGC AGC AAATAG C AGT C TT TT TAC CC AAAGGAAC ATGAAGG CC C AAG AAG AGAT 

GATTTCTGAACTCAGGCAACAGAAATTTTACCTGGAGACACAGGCTGGGAAGTTGGAGGCCCAGAAC 

C G AAAAC TG GAGG AG C AG CTGGAGAAG ATCAGCC AC C AAGACCACAGTGACAAGAATC GGCTGC TGG 

AACTGGAGACAAGATTGCGGGAGGTGAGTCTAGAGCACGAGGAGCAGAAACTGGAGCTCAAGCGCCA 

GC TC AC AGAGCTACAGCTCTCC C TGC AGG AGCGC GAGTC ACAGTTGAC AGCCCTGCAGGCTGCACGG 

GCGGCCCTGGAGAGCCAGCTTCGCCAGGCGAAGACAGAGCTGGAAGAGACCACAGCAGAAGCTGAAG 

AGGAGATCCAGGCACTCACGGCACATAGAGATGAAATCCAGCGCAAATTTGATGCTCTTCGTAACAG 

CTGTACTGTGATCACAGACCTGGAGGAGCAGCTAAACCAGCTGACCGAGGACAACGCTGAACTCAAC 

AACCAAAACTTCTACTTGTCCAAACAACTCGATGAGGCTTCTGGCGCCAACGACGAGATTGTACAAC 

TGCG AAGTG AAG TGGAC CATC TC CGCC GGGAG ATCAC GGAACG AGAG ATGC AGC TTAC CAGC C AGAA 

GCAAACGATGGAGGCTCTGAAGACCACGTGCACCATGCTGGAGGAACAGGTCATGGATTTGGAGGCC 

CTAAACGATGAGCTGCTAGAAAAAGAGCGGCAGTGGGAGGCCTGGAGGAGCGTCCTGGGTGATGAGA 

AATCCCAGTTTGAGTGTCGGGTTCGAGAGCTGCAGAGGATGCTGGACACCGAGAAACAGAGCAGGGC 

GAGAGCCGATCAGCGGATCACCGAGTCTCGCCAGGTGGTGGAGCTGGCAGTGAAGGAGCACAAGGCT 

GAGATTCTCGCTCTGCAGCAGGCTCTCAAAGAGCAGAAGCTGAAGGCCGAGAGCCTCTCTGACAAGC 

TCAATGACC TGGAG AAGAAGCATGCTATGCTTGAAATGAATGCCCGAAGCTTACAGCAGAAGC TGG A 

GACTGAACGAGAGCTCAAACAGAGGCTTCTGGAAGAGCAAGCCAAATTACAGCAGCAGATGGACCTG 

CAGAAAAATCACATTTTCCGTCTGACTCAAGGACTGCAAGAAGCTCTAGATCGGGCTGATCTACTGA 

AGACAGAAAGAAGTGACTTGGAGTATCAGCTGGAAAACATTCAGGTGCTCTATTCTCATGAAAAGGT 

GAAAATGGAAGGCACTATTTCTCAACAAACCAAACTCATTGATTTTCTGCAAGCCAAAATGGACCAA 

CCTGCTAAAAAGAAAAAGGTGCCTCTGCAGTACAATGAGCTGAAGCTGGCCCTGGAGAAGGAGAAAG 

CTCGCTGTGCAGAGCTAGAGGAAGCCCTTCAGAAGACCCGCATCGAGCTCCGGTCCGCCCGGGAGGA 

AGCTGCCCACCGCAAAGCAACGGACCACCCACACCCATCCACGCCAGCCACCGCGAGGCAGCAGATC 

GCCATGTCTGCCATCGTGCGGTCGCCAGAGCACCAGCCCAGTGCCATGAGCCTGCTGGCCCCGCCAT 

CCAGCCGC^GAAAGGAGTCTTCAACTCCAGAGGAATTTAGTCGGCGTCTTAAGGAACGCATGCACCA 

CAATATTCCTCACCGATTCAACGTAGGACTGAACATGCGAGCCACAAAGTGTGCTGTGTGTCTGGAT 

ACCGTGCACTTTGGACGCCAGGCATCCAAATGTCTAGAATGTCAGGTGATGTGTCACCCCAAGTGCT 

CCACGTGCTTGCCAGCCACCTGCGGCTTGCCTCCTGAATATGCC^C^CACTTCACCGAGGCCTTCTG 

CCGTGACAAAATGAACTCCCCAGGTCTCCAGACCAAGGAGCCCAGCAGCAGCTTCCACCTGGAAGGG 

TGGATGAAGGTGCCCAGGAATAACAAACGAGGACAGCAAGGCTGGGAGAGGA^ 

AGGGATCAAAAGTCCTCATTTATGACAATGAAGCCAGAGAAGCTGGACAGAGGCCGGTGGAAGAATT 

TGAGC TGTGCCTTCCC GACGGGGATGTATCTATTCATGGTGCCGTTGGTGCTTCCGAACTCGCAAAT 

ACAGCCAAAGCAGATGTCCC^TACATACTGAAGATGGAATCTCACCCGC^CACCACCTGCTGGCCCG 

GGAGAACCCTCTACTTGCTAGCTCCCAGCTTCCCTGACAAACAGCGCTGGGTCACCGCCTTAGAATC 

AGTTGTC GCAGGTGGGAGAGTTTC TAGGGAAAAAGC AGAAGCTGATGC TAAACTGCTTGG AAACTCC 

CTGCTGAAACTGGAAGGTGATGACCGTCTAGACATOAACTGCACGCTGCCCTTCAGTGACCAGGTAG 

TGTTGGTGGGCACCGAGGAAGGGCTCTACGCCCTGAATGTCTTGAAAAACTCCCTAACCCATGTCCC 

AGGAATTGGAGCAGTCTTCOIAATTTATATTATCAAGGACCTGGAGAAGCTACTC^TGATAG 

GAAGAGCGGGCACTGTGTCTTGTGGACGTGAAGAAAGTGAAACAGTCCCTGGCCCAGTCCCACCTGC 

C TGCCCAGCCCGACATCTC ACCC AAC ATTTTTGAAGCTGTCAAGGGCTGC C ACTTGTTTGGGGC AGG 

CAAGATTGAGAACGGGCTCTGCATCTGTGC^GCCATGCCCAGCAAAGTCGTC^TTCTCCGCTA 

GAAAACCTCAGCAAATACTGCATCCGGAAAGAGATAGAGACCTCAGAGCCCTGCAGCTGTATCCACT 

TCACCAATTACAGTATCCTCATTGGAACCAATAAATTCTACGAAATCGACATGAAGCAGTACACGCT 

CGAGGAATTCCTGGATAAGAATGACCATTCCTTGGCACCTGCTGTGTTTGCCGCCTCTTCCAACAGC 

TTCCCTGTCTCAATCGTGCAGGTGAACAGCGC^GGGCAGCGAGAGGAGTACTTGCTGTCTTTCCACG 

AATTTGGAGTGTTCGTGGATTCTTACGGAAGACGTAGCCGCACAGACGATCTCAAGTGGAGTCGCTT 

ACCTTTGGCCTTTGCCTACAGAGAACCCTATCTGTTTGTGACCCACTTCAACTCACTCGAAGTAATT 

GAGATC C AGGC ACGC TCCTCAGCAGGGAC CC C TGCCCGAGCGTACCTGGAC ATCCCG AACCCGCGCT 

ACCTGGGCCCTGCCATTTCCTCAGGAGCGATTTACTTGGCGTCCTCATACCAGGATAAATTAAGGGT 

CATTTGCTGC^GGGAAACCTCGTGAAGGAGTCCGGCACTGAACACCACCGGGGCCCGTCCACCTCC 

CGCAGCAGCCCCAACAAGCGAGGCCCACCCACGTACAACGAGCACATCACCAAGCGCGTGGCCTCCA 

GCCCAGCGCCGCCCGAAGGCCCCAGCCACCCGCGAGAGCCAAGCACACCCCACCGCTACCGCGAGGG 

GCGGACCGAGCTGCGCAGGGACAAGTCTCCTGGCCGCCCCCTGGAGCGAGAGAAGTCCCCCGGCCGG 

ATGCTCAGCACGCGGAGAGAGCGGTCCCCCGGGAGGCTGTTTGAAGACAGCAGCAGGGGCCGGCTGC 

C TGCGGG AG CCGTGAGGACCCCGCTGTCCCAGGTGAACAAGGTGAGGCAGCATTCCGAGGC CTGTG T 

GTCTGTTGCGGAGGCCAGGAGTGACTTGGGGAACTGA 

ORF Start: ATG at 1 | ORF Stop: TGA at 6199





SEQ ID NO: 14 | 2066 aa | MW at 236008.5kD


NOV1g,
CG106764-02 
Protein 
Sequence 


M^FKYGAI^PLDAGAAEPIASRASRLmiFFQGKPPFMTQQQMSPLSREGILDALFVLFEECSQP^ 

MKIKHVSNFVRKCSDTIAELQELQPSAKDFEVRSLVGCGOT 

AQEQVSFFEEERNILSRSTSPWIPQLQYAFQDRNHLYLVMEY 

L AEL ILAVHS VHLMGYVHRD IKPENILVDRTGH IIOliVDFGS AAKI^SNKVNAKL P IGT PDYMAPEVL 
TVMNGDGKGT YGLDCDWWSVGVIAYEMI YGRS PFAEGT SARTFNNIMNFQRFLKFPDDPKVS SDFLD 
LIQSLLCGQKERLKFEGLCCHPFFSKIDWNNIRNAPPPFVPTLKSDDDTSNFDEPEKNSWVSSSPCQ 
LSPSGFSGEELPFVGFSYSKALGILGRSESWSGLDSPAKTSSMEKKLLIKSKELODSODKCHKMEO 






EMTIUjHRRVSEVEAVLSQK^ 

LLHDIREQSRKLQEIKEQEYQAQVEEMRLMMNQLEEDLVSARRRSDL
ECQHKLLKAKDQGKPEVGEYAKLEKINAEQQLKIQELQE^
ELEKLQNREDSSEGIRKKLVEAEERRHSLENKVKRLETMERR^^
EEKHREAQVSAQHLEVHLKQKEQHYEEKIKVLDNQIKKDL^

KAMINAMDSKIRSLEQRIVELSEANKLAANSSLFTQRNMKAQEEMISELRQQKFYLETQAGKLEAQN

RKLEEQLEKISHQDHSDKNE^LELETRLREVSLEHEEQKLEIiKRQIjTELQLSLQERESQLTALQA^ 
AALESQLRQAKTELEETTAEAEEEIQALTAHRDEIQRKFDALRNSCWITDLEEQLNQLTEDNAELN 
NQNFYLSKQLDEASGAITOEWQLRSEVDHLR^ 

IOT^LEKERQWEA^SVLGDEKSQFECRTOELQRMLDTEKQSRARADQM 
EILALQQALKEQKLKAESLSDKLNDLEKKHAML 
QKI^FRLTQGLQEAQDRADLLKTERSDLEYQLENIQ 
PAKKKKVPLQYI^LKLALEKEKARCAELEEALQKTRIEM 

AMS AIVRS PEHQ P SAMSLLAP PS SRRKES ST PEEFSRRLKERMHHNI PHRFNVGLNMRATKCAVCLD 
TVHFGRQASKCLECQVMCHPKCSTCLPATCGLPAEYATHFTEAFCRDKMNSPGLQTKEPSSSLHLEG 
WMKVPRNNKRGQQGWDRKYIVLEGSKV^^ 
TAKADVPYILKMESHPHTTCWGRTLYLLAPSFPDKQRWT^ 

LLKLEGTODRLDlyiNCTLPFSDQVVLVGTEEGLYALNVLKNSLTHVPGIGAVFQ IYI IKDLEKLLMI AG 
EERALCLVDVKKVKQSLAQSHLPAQPDI S PN IFEAVKGCHLFGAGKIENGLC ICAAMPSKWILRYN 
E^SKYCIRKEIETSEPCSCIHFTETtfSILIGTN^^ 

FPVSIVQWSAGQREEYLLCFHEFGVFVDSYGRRSRTDDLKWSRLPLAFAYREPYLFVTHFNSLEVI 
EIQARSSAGTPARAYLDIPNPRYLGPAISSGAIYIiASSYQDKLRVICCKGNLVKESGTEHHRGPSTS 
RSS PNBGRGPPTYNEHITKRVASSPAP PEGPSHPREPS TPHRYREGRTELRRDKS PGRPLEREKSPGR 
ML STRRERS PGRLFEDS SRGRLPAGAVRTPL SQVNKVRQHSEAC VS VAEARSDLGN 
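
The ORF coordinates and protein lengths reported in Table 1A can be cross-checked arithmetically. The following is a minimal sketch, assuming that positions are 1-based, that the reported start is the first base of the ATG, and that the reported stop is the first base of the stop codon. The function name is illustrative and not part of the original disclosure.

```python
def orf_protein_length(start: int, stop: int) -> int:
    """Number of residues encoded between a start codon and a stop codon.

    Assumes 1-based coordinates, with `start` the first base of the ATG and
    `stop` the first base of the stop codon (an assumption about how the
    sequence tables report positions).
    """
    coding_bases = stop - start  # bases from the ATG up to, not including, the stop codon
    if coding_bases <= 0 or coding_bases % 3 != 0:
        raise ValueError("start and stop are not in the same reading frame")
    return coding_bases // 3


# NOV1g: ORF Start ATG at 1, ORF Stop TGA at 6199, reported as 2066 aa.
print(orf_protein_length(1, 6199))
```

Under these assumptions the NOV1g coordinates give 2066 residues, which agrees with the length stated for SEQ ID NO: 14.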



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table IB. 

Table 1B. Comparison of NOV1a against NOV1b through NOV1g.

Protein Sequence | NOV1a Residues | Match Residues | Identities for the Matched Region | Similarities for the Matched Region
NOV1b | 1..615 | 5..620 | 601/616 (97%) | 602/616 (97%)
NOV1c | 615..1442 | 4..831 | 690/828 (83%) | 691/828 (83%)
NOV1d | 615..1442 | 4..846 | 690/843 (81%) | 691/843 (81%)
NOV1e | 1436..2053 | 3..620 | 618/618 (100%) | 618/618 (100%)
NOV1f | 1436..2053 | 3..635 | 618/633 (97%) | 618/633 (97%)
NOV1g | 1..2051 | 1..2051 | 1900/2051 (92%) | 1900/2051 (92%)
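
The identity percentages in Table 1B appear to be truncated rather than rounded: 690/843 is about 81.85%, yet it is reported as 81%. The sketch below reproduces the reported figures under that assumption. The function and dictionary names are illustrative only.

```python
def percent_truncated(matches: int, aligned: int) -> int:
    """Percentage of matching positions, truncated toward zero."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    return (100 * matches) // aligned


# (identities, aligned length, percentage reported in Table 1B)
TABLE_1B_IDENTITIES = {
    "NOV1b": (601, 616, 97),
    "NOV1c": (690, 828, 83),
    "NOV1d": (690, 843, 81),
    "NOV1e": (618, 618, 100),
    "NOV1f": (618, 633, 97),
    "NOV1g": (1900, 2051, 92),
}

for name, (matches, aligned, reported) in TABLE_1B_IDENTITIES.items():
    computed = percent_truncated(matches, aligned)
    print(f"{name}: {matches}/{aligned} -> {computed}% (reported {reported}%)")
```

Conventional rounding would give 82% for NOV1d and 98% for NOV1b and NOV1f, so truncation is the reading that matches every identity row of the table.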



Further analysis of the NOV1a protein yielded the following properties shown in
Table 1C.

Table 1C. Protein Sequence Properties NOV1a

PSort analysis: | 0.9800 probability located in nucleus; 0.3000 probability located in microbody (peroxisome); 0.1000 probability located in mitochondrial matrix space; 0.1000 probability located in lysosome (lumen)
SignalP analysis: | No Known Signal Sequence Predicted



A search of the NOV1a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 1D.

Table 1D. Geneseq Results for NOV1a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV1a Residues | Match Residues | Identities for the Matched Region | Similarities for the Matched Region | Expect Value
AAU03501 | Human protein kinase #1 - Homo sapiens, 2053 aa. [WO200138503-A2, 31-MAY-2001] | 1..2051 | 1..2053 | 2044/2053 (99%) | 2046/2053 (99%) | 0.0
AAB43359 | Human ORFX ORF3123 polypeptide sequence SEQ ID NO:6246 - Homo sapiens, 1286 aa. [WO200058473-A2, 05-OCT-2000] | 768..2053 | 1..1286 | 1286/1286 (100%) | 1286/1286 (100%) | 0.0
ABB11117 | Human RHO/RAC effector homologue, SEQ ID NO:1487 - Homo sapiens, 999 aa. [WO200157188-A2, 09-AUG-2001] | 968..1947 | 1..980 | 976/980 (99%) | 976/980 (99%) | 0.0
AAU31443 | Novel human secreted protein #1934 - Homo sapiens, 910 aa. [WO200179449-A2, 25-OCT-2001] | 1114..1982 | 1..869 | 867/869 (99%) | 867/869 (99%) | 0.0
AAE16261 | Human kinase PKEN-7 protein - Homo sapiens, 497 aa. [WO200196547-A2, 20-DEC-2001] | 1..467 | 1..468 | 463/468 (98%) | 465/468 (98%) | 0.0

In a BLAST search of public sequence databases, the NOV1a protein was found to
have homology to the proteins shown in the BLASTP data in Table 1E.

Table 1E. Public BLASTP Results for NOV1a

Protein Accession Number | Protein/Organism/Length | NOV1a Residues | Match Residues | Identities for the Matched Portion | Similarities for the Matched Portion | Expect Value
O88938 | Rho/rac-interacting citron kinase - Mus musculus (Mouse), 2055 aa. | 1..2053 | 1..2055 | 1974/2055 (96%) | 2014/2055 (97%) | 0.0
O88528 | Citron-K kinase - Mus musculus (Mouse), 1641 aa (fragment). | 373..2053 | 1..1641 | 1599/1683 (95%) | 1616/1683 (96%) | 0.0
P49025 | Citron protein (Rho-interacting, serine/threonine kinase 21) - Mus musculus (Mouse), 1597 aa. | 467..2053 | 9..1597 | 1563/1589 (98%) | 1578/1589 (98%) | 0.0
Q9QX19 | Postsynaptic density protein - Rattus norvegicus (Rat), 1618 aa. | 467..2053 | 1..1618 | 1556/1619 (96%) | 1573/1619 (97%) | 0.0
O14578 | Citron protein (Rho-interacting, serine/threonine kinase 21) - Homo sapiens (Human), 1286 aa (fragment). | 768..2053 | 1..1286 | 1286/1286 (100%) | 1286/1286 (100%) | 0.0
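
The residue ranges and alignment lengths in Table 1E can be related to one another. The sketch below assumes that each range is written as inclusive, 1-based `first..last` coordinates, and that the denominator of the identity fraction is the number of alignment columns. Under those assumptions, the difference between the alignment length and a span is the number of gap columns on that side. The helper names are hypothetical.

```python
def span_length(residue_range: str) -> int:
    """Length of an inclusive, 1-based range written as 'first..last'."""
    first, last = (int(part) for part in residue_range.split(".."))
    if last < first:
        raise ValueError(f"range runs backwards: {residue_range}")
    return last - first + 1


def gap_columns(residue_range: str, aligned_length: int) -> int:
    """Alignment columns not occupied by a residue of this sequence."""
    return aligned_length - span_length(residue_range)


# Human citron fragment row: NOV1a 768..2053 against 1..1286, 1286/1286 identities.
print(span_length("768..2053"), span_length("1..1286"))

# Mouse citron kinase row: NOV1a 1..2053 against 1..2055, 2055 aligned columns.
print(gap_columns("1..2053", 2055), gap_columns("1..2055", 2055))
```

For the human fragment both spans equal the alignment length, which is consistent with an ungapped, fully identical match. For the 2055-residue mouse protein the same arithmetic implies two gap columns on the NOV1a side.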



PFam analysis predicts that the NOV1a protein contains the domains shown in
Table 1F.

Table 1F. Domain Analysis of NOV1a

Pfam Domain | NOV1a Match Region | Identities for the Matched Region | Similarities for the Matched Region | Expect Value
pkinase | 97..359 | 89/302 (29%) | 196/302 (65%) | 2.7e-62
pkinase_C | 360..389 | 15/32 (47%) | 24/32 (75%) | 0.00023
DAG_PE-bind | 1389..1437 | 14/51 (27%) | 34/51 (67%) | 6.1e-05
PH | 1470..1589 | 20/121 (17%) | 87/121 (72%) | 1.8e-11
CNH | 1618..1915 | 107/378 (28%) | 289/378 (76%) | 1.5e-110
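
The domain coordinates and expect values of Table 1F lend themselves to two simple checks: whether any predicted domains overlap, and which hits are the most significant. The sketch below uses the values from the table; the `Domain` type and function names are assumptions introduced for illustration.

```python
from typing import List, NamedTuple, Tuple


class Domain(NamedTuple):
    name: str
    start: int   # first residue of the match region (1-based, inclusive)
    end: int     # last residue of the match region (inclusive)
    expect: float


NOV1A_DOMAINS = [
    Domain("pkinase", 97, 359, 2.7e-62),
    Domain("pkinase_C", 360, 389, 0.00023),
    Domain("DAG_PE-bind", 1389, 1437, 6.1e-05),
    Domain("PH", 1470, 1589, 1.8e-11),
    Domain("CNH", 1618, 1915, 1.5e-110),
]


def overlapping_pairs(domains: List[Domain]) -> List[Tuple[str, str]]:
    """Neighbouring domains (ordered by start) that share residues.

    An empty result means that no two domains overlap at all.
    """
    ordered = sorted(domains, key=lambda d: d.start)
    return [(a.name, b.name) for a, b in zip(ordered, ordered[1:]) if b.start <= a.end]


def by_significance(domains: List[Domain]) -> List[Domain]:
    """Domains ordered from the smallest (most significant) expect value."""
    return sorted(domains, key=lambda d: d.expect)


print(overlapping_pairs(NOV1A_DOMAINS))
print([d.name for d in by_significance(NOV1A_DOMAINS)])
```

With the tabulated values, no domains overlap, and the CNH and pkinase matches carry the smallest expect values.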



Example 2. 

The NOV2 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 2A. 



Table 2A. NOV2 Sequence Analysis 




SEQ ID NO: 15 | 1238 bp

NOV2a,
CG117662-01
DNA Sequence


ATGGATGGATGGAGAAGGATGCCTCGCTGGGGACTGCTGCTGCTC^ 

TCTCCCGACAGACACCACCACCTTTAAACGGATCTTCCTCAAGAGAATGCCCTCAATCCGAGAAAGCC 
TGAAGGAACGAGGTGTGGACATGGCCAGGCTTGGTCCCGAG 

CTTGGCAACACCACCTCCTCCGTGATCCTCACCAACTACATGGACACCCAGTACTAa?GGCGAGATTGG 

CATCGGC7^CCCCACCCCAGACCTTCAAAGTCGTCTTTGACACTGGTTCGTCCAATC 

CCTCCAAGTGCAGCCGTCTCTACACTGCCTGTGTGTATC 

AGCTACAAGCACAATGGAACAGAACTCACCCTCCGCTATTCAACAGGGACAGTCAGTGGCTTTCTCAG 

CCAGGACATCATCACCGTGGGTGGAATCACGGTGACACAGATGTTTGG 

CCTTACCCTTCATGCTGGCCGAGTTTGATGGGGTTGTGGX^^ 

AGGGTCACCCCTATCTTCGACAAGATCATCTCCG^ 

CTACAACAGAGATTCCGAGAATTCCCAATCGCTGGGAGGAC^ 

AGCATTACGAAGGGAATTTCCACTATATCAACCTCATCAAGACTGGTGTCTGGCAGATTCAAATGAAG 
GGGGTGTC TGTGGGGTCATCCACCTTGCTCTGTGAAGACGGCTGCC TGGCATTGGTAGACACCGGTGC 
ATCCTACATCTCAGGTTCTACCAGCTCCATAGAGAAGCTC^TGGAGGCCTTGGGAGCCAAGAAGAGGC 
TGTTTGATTATGTCGTGAAGTGTAACGAGGGCCCTACACTCCCCGACATCTCTTTCCACCTGGGAGGC 
AAAGAATACACGCTCACCAGCGCGGACTATGTATTTCAGGAATCCTACAGTAGTAAAAAGCTGTGCAC 
ACTGGCCATCCACGCCATGGATATCCCGCCACCCACTGGACCCACCTGGGCCCTGGGGGCCACCTTCA 
TCCGAAAGTTCTACACAGAGTTTGATCGGCGTAACAACCGCATTGGCTTCGCCTCGGCCCGCT6AGGC 
CCTCTGCCACCCAG 




ORF Start: ATG at 1 | ORF Stop: TGA at 1219





SEQ ID NO: 16 | 406 aa | MW at 45030.9kD

NOV2a,
CG117662-01
Protein 
Sequence 


MDGWRRMPRWGLLLLLWGSCTFGLPTDTTTFKRI^

LGNTTSSVILTNYMDTQYYGEIGIGTPPQTFKVVFDTGSSNV^ 

SYKHNGTELTLRYSTGTVSGFLSQDIITVGGIWTQMFGOT 

RVTPIFDNI I SQGVLKEDVFSFYYNRDSENSQSLGGQJVLGGSDPQHYEGNFHYINLIKTGVWQIQ^ 

GVSVGSSTLLCEDGCLALVDTGASYISGSTSSIEKLMEALGAKKR^ 

KEYTLTSADYVFQESYSSKKLCTLAIHAMD^^ 
SEQ ID NO: 17 | 911 bp

NOV2b,
CG117662-02
DNA Sequence 


ATGGATGGATGGAGAAGGATGCCTCGCTGGGGACTGCTGCTGCTGCTCTGGGGCTCCTGTACCTTTG 
GTCTCCCGACAGACACCACCACCTTTAAACGGATCTTCCTCAAGAGAATGCCCTCAATCCGAGAAAG 
CC TGAAGGAACGAGGTGTGGAC ATGGCCAGGC TTGGTCC CG AGTGGAGCC AACCC ATGAAG AGGCTG 
ACACTTGGCAACACCACCTCCTCCGTGATCCTCACCAACTACATGGACACCCAGTACTATGGCGAGA 
TTGGCATCGGCAC CCCACCCCAGAC CTTC AAAGTCGTCTTTGAC ACTGGTTCGTC CAATGTT TGGGT 
GCCCTCCTCCAAGTGCAGCCGTCTCTACACTGCCTGTGTGTATCACAAGCTCTTCGATGCTTCGGAT 
TCCTCCAGCTACAAGCACAATGGAACAGAACTCACCCTCCGCTATTCAACAGGGACAGTCAGTGGCT 
TTCTCAGCCAGGACATCATCACCGTGTCTGTGGGGTCATCCACCTTACTCTGTGAAGACGGCTGCCT 
GGCATTGGTAGACACCGGTGCATCCTACATCTCAGGTTCTACCAGCTCCATAGAGAAGCTCATGGAG 
GCCTTGGGAGCCAAGAAGAGGCTGTTTGATTATGTCGTGAAGTGTAACGAGGGCCCTACACTCCCCG 
ACATCTCTTTCCACCTGGGAGGCAAAGAATACACGCTCACCAGCGCGGACTATGTATTTCAGGAATC 
CTACAGTAGTAAAAAGCTGTGCACACTGGCCATCCACGCCATGGATATCCCGCCACCCACTGGACCC 
ACCTGGGCCCTGGGGGCCACCTTCATCCGAAAGTTCTACACAGAGTTTGATCGGCGTAACAACCGCA 
TTGGCTTCGCCTCGGCCCGCTGAGGCCCTCTGCCACCCAG 




ORF Start: ATG at 1 | ORF Stop: TGA at 892





SEQ ID NO: 18 | 297 aa | MW at 33025.3kD

NOV2b,
CG117662-02
Protein 
Sequence 


MDGWRRMPRWGLLLLLWGSCTFGLPTDT

TLGNTTSSVILTNYMOTQYYGEIGIGTPPQTFKVVFDTGSSNVWPSSKCSRLYTA 
SSSYKHNGTELTLRYSTGTVSGFLSQDI ITVSVGSSTLLCEDGCLALVDTGASYISGSTSS IEKLME 
ALGAKKRLFDYWKCNEG PTL PD I S FHLGGKEYTL TS ADYVFQES YS S KKLC TLA IHAMD I P P P TG P 
TWALGATFIRKFYTEFDRRNNRIGFASAR 
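
The molecular weights listed in Table 2A can be sanity-checked against the protein lengths. As a rule of thumb, the average mass of an amino acid residue in a protein is roughly 110 Da, so dividing the reported weight by the residue count should land near that figure. The sketch below assumes that the numbers labelled "kD" in the table are numerically daltons; that reading is an inference from the arithmetic, not a statement in the source.

```python
def mean_residue_mass(molecular_weight_da: float, n_residues: int) -> float:
    """Average mass per residue, in daltons."""
    if n_residues <= 0:
        raise ValueError("residue count must be positive")
    return molecular_weight_da / n_residues


# (reported molecular weight, reported length) from Table 2A
REPORTED = {
    "NOV2a": (45030.9, 406),
    "NOV2b": (33025.3, 297),
}

for name, (weight, length) in REPORTED.items():
    print(f"{name}: {mean_residue_mass(weight, length):.1f} Da per residue")
```

Both entries come out at about 111 Da per residue, which is what would be expected if the reported values are in daltons.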



Sequence comparison of the above protein sequences yields the following sequence
relationships shown in Table 2B.

Table 2B. Comparison of NOV2a against NOV2b.

Protein Sequence | NOV2a Residues | Match Residues | Identities for the Matched Region | Similarities for the Matched Region
NOV2b | 1..165 | 1..165 | 165/165 (100%) | 165/165 (100%)

Further analysis of the NOV2a protein yielded the following properties shown in
Table 2C.

Table 2C. Protein Sequence Properties NOV2a

PSort analysis: | 0.3700 probability located in outside; 0.2541 probability located in microbody (peroxisome); 0.1900 probability located in lysosome (lumen); 0.1000 probability located in endoplasmic reticulum (membrane)
SignalP analysis: | Cleavage site between residues 24 and 25
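
A predicted cleavage site between residues 24 and 25 means that residues 1 to 24 form the putative signal peptide and that the processed protein would begin at residue 25. The sketch below illustrates that split on the first 30 residues of NOV2a, as translated from the start of the NOV2a DNA sequence in Table 2A. The function name and the 30-residue excerpt are choices made for this example.

```python
from typing import Tuple


def split_at_cleavage(sequence: str, last_signal_residue: int) -> Tuple[str, str]:
    """Split a protein at a cleavage site given as the last signal-peptide residue.

    `last_signal_residue` is 1-based, so 24 means "cleave between residues 24 and 25".
    """
    if not 0 < last_signal_residue < len(sequence):
        raise ValueError("cleavage site lies outside the sequence")
    return sequence[:last_signal_residue], sequence[last_signal_residue:]


# First 30 residues of NOV2a (translation of the first 90 bases of SEQ ID NO: 15).
NOV2A_N_TERMINUS = "MDGWRRMPRWGLLLLLWGSCTFGLPTDTTT"

signal_peptide, mature_start = split_at_cleavage(NOV2A_N_TERMINUS, 24)
print(signal_peptide)
print(mature_start)
```

The split leaves a 24-residue leader with a leucine-rich hydrophobic core, and the remainder of the excerpt begins at the residue that would become position 1 of the processed protein.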
A search of the NOV2a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 2D.

Table 2D. Geneseq Results for NOV2a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV2a Residues | Match Residues | Identities for the Matched Region | Similarities for the Matched Region | Expect Value
AAW23244 | Human renin - Homo sapiens, 406 aa. [WO9728684-A1, 14-AUG-1997] | 1..406 | 1..406 | 404/406 (99%) | 404/406 (99%) | 0.0
AAP50135 | Sequence of pre-pro-renin - Homo sapiens, 406 aa. [EP135347-A, 27-MAR-1985] | 1..406 | 1..406 | 404/406 (99%) | 404/406 (99%) | 0.0
ABB11781 | Human renin homologue, SEQ ID NO:2151 - Homo sapiens, 438 aa. [WO200157188-A2, 09-AUG-2001] | 1..406 | 31..438 | 391/408 (95%) | 393/408 (95%) | 0.0
AAU72879 | Human aspartyl protease partial protein sequence #4 - Homo sapiens, 412 aa. [WO200183782-A2, 08-NOV-2001] | 24..405 | 14..409 | 169/400 (42%) | 246/400 (61%) | 1e-90
AAY93685 | Amino acid sequence of novel polypeptide PR0292 - Homo sapiens, 412 aa. [WO200037640-A2, 29-JUN-2000] | 24..405 | 14..409 | 169/400 (42%) | 246/400 (61%) | 1e-90




In a BLAST search of public sequence databases, the NOV2a protein was found to
have homology to the proteins shown in the BLASTP data in Table 2E.

Table 2E. Public BLASTP Results for NOV2a

Protein Accession Number | Protein/Organism/Length | NOV2a Residues | Match Residues | Identities for the Matched Portion | Similarities for the Matched Portion | Expect Value
P00797 | Renin precursor, renal (EC 3.4.23.15) (Angiotensinogenase) - Homo sapiens (Human), 406 aa. | 1..406 | 1..406 | 405/406 (99%) | 405/406 (99%) | 0.0
Q9TSZ1 | Preprorenin precursor (EC 3.4.23.15) - Callithrix jacchus (Common marmoset), 400 aa. | 1..406 | 1..400 | 381/406 (93%) | 389/406 (94%) | 0.0
P52115 | Renin precursor, renal (EC 3.4.23.15) (Angiotensinogenase) - Ovis aries (Sheep), 400 aa. | 7..406 | 1..400 | 292/401 (72%) | 338/401 (83%) | e-175
Q15296 | Kidney mRNA fragment for renin (Aa 105401) - Homo sapiens (Human), 300 aa (fragment). | 108..406 | 1..300 | 297/300 (99%) | 298/300 (99%) | e-172
P06281 | Renin precursor, renal (EC 3.4.23.15) (Angiotensinogenase) - Mus musculus (Mouse), 402 aa. | 5..406 | 4..402 | 281/403 (69%) | 331/403 (81%) | e-167



PFam analysis predicts that the NOV2a protein contains the domains shown in
Table 2F.

Table 2F. Domain Analysis of NOV2a

Pfam Domain | NOV2a Match Region | Identities for the Matched Region | Similarities for the Matched Region | Expect Value
asp | 31..405 | 174/428 (41%) | 339/428 (79%) | 4.1e-197



Example 3. 

The NOV3 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 3A.



Table 3A. NOV3 Sequence Analysis 
SEQ ID NO: 19 | 2827 bp

NOV3a,
CG118051-01 
DNA Sequence 


TGGCGATGCTACTGTTTAATTGCAGGAGGTGGGGGTGTGTGTACCATGTACCAGGGCTATTAGAAGCA 


AGAAGGAAGGAGGGAGGGCAGAGCGCCCTGC TGAGCAACAAAGG ACTC C TGC AGCCTTCTCTGTCTGT 


CTCTTGGCACAGGCACATGGGGAGGCCTCCCGCAGGTGGGGGGCCACCAGTCCAGGGGTGGGAGCACT 


ACAGGGCACGAGTTGGTTTGGGAGCTGCCAGTCTCCTGGGAGGATCGCAGTCAGCAGAGCAGGGCTGA 


GGC CTGGGGGTAGGAGCAGAGCCTGCGCATCTGG AGGC AGC ATGTC CAAGAAAGGGAGTGGAGGTGCA 


GCGAAGGACCCAGGGGCAGAGCCCACGCTGGGGATGGACCCCTTCGAGGACACACTGCGGCGGCTGCG 


TGAGGCCTTCAACTGAGGGCGCACGCGGCCGGCCGAGTTCCGGGCTGCGCAGCTCCAGGGCCTGGGCC 


ACTTCCTTCAAGAAAAC AAGC AGC TTCTGCGCGACGTGCTGGCC CAGGACC TGC ATAAGCCAGCTTTC 


GAGGCAGACATATCTGAGCTC ATC CTTTGCC AGAACGAGGTTGACTACGC TCTC AAGAACCTTC AGGC 


CTGGAT6AAGGATGAACCACGGTCCACGAACCTGTTCATGAAGCTGGACTCGGTCTTCATCTGGAAGG 
AACCCTTTGGCCTGGTCCTCATCATCGCACCCTGGAACTACCCATTGAACCTGACCCTGGTGCTCCTG 
GTGGGCACCCTCCCCGCAGGGAATTGCGTGGTGCTGAAGCCGTCAGAAATCAGCCAGGGCACAGAGAA 
GGTCCTGGCTGAGGTGCTGCCCCAGTACCTGGACCAGAGCTGCTTTGCCGTGGTGCTGGGCGGACCCC 
AGGAGACAGGGCAGCTGCTAGAGCACAAGTTGGACTACATCTTCTTCACAGGGAGCCCTCGTGTGGGC 
AAGATTGTC ATGACTGC TGCCACCAAGCACCTGACGC C TGTCACCC TGGAGCTGGGGGGCAAGAACCC 
CTGCTACGTGGACGACAACTGCGACCCCCAGACCGTGGCCAACCGCGTGGCCTGGTTCTGCTACTTCA 
ATGCCGGCCAGAC CTGCGTGGCCC CTGACTACGTCCTGTGC AGC CC CGAGATGCAGGAGAGGCTGC TG 
CCCGCCC TGCAGAGCACC ATC ACC CGTTTCTATGGCGACG ACCCCC AGAGC TCCCCAAACCTGGGCCG 
GATCATCAACCAGAAACAGTTCCAGCGGCTGCGGGCATTGCTGGG^ 

GCCAGAGCAACGAGAGCGATCGCTACATCGCCCCCACGGTGCTGGTGGACGTGCAGGAGACGGAGCCT 

GTGATGCAGGAGGAGATCTTCGGGCCCATCCTGCCCATCGTGAACGTGCAGAGCGTGGACGAGGCCAT 

CAAGTTCATCAACCGGCAGGAGAAGCCCCTGGCCCTGTACGCCTTCTCCAACAGCAGACAGGTTGTGA 

ACCAGATGCTGGAGCGGACCAGCAGCGGCAGCTTTGGAGGCAATGAGGGCTTCACCTACATATCTCTG 

CTGTCCGTGCCATTCGGGGGAGTCGGCCAC^GTGGGATGGGCCGGTACCACGGCAAGT^ 

CACCTTCTCCCACCACCGCACCTGCCTGCTCGCCCCCTCCGGCCTGGAGAAATTAAAGGAGATCCGCT 

ACCCACCCTATACCGACTGGAACCAGCAGCTGTTACGCTGGGGCATGGGCTCCCAGAGCTGCACCCTC 

CTGTGAGCGTCCCACCCGCCTCCAACGGGTCACACAGAGAAACCTGAGTCTAGCCATGAGGGGCTTAT 


GCTCCCAACTCACATTGTTCCTCCAGACCGCAGGCTCCCCCAGCCTCAGGTTGCTGGAGCTGTCACAT 


GACTGC ATCCTGC CTGCC AGGGCTGCAAAGC AAGGTC TTGC TTCTATCTGGGGGACGCTGCTCGAGAG 


AGGCCGAGAGGCCGCAGAAC ATGCCAGGTGTCCTC ACTC AC CCCACCCTCC CCAATTCC AGCCCTTTG 


CCCTCTCGGTCAGGGTTGGCCAGGCCCAGTCACAGGGGCAGTGTCACCCTGGAAAATACAGTGCCCTG 


CCTTCTTAGGGGCATCAGCCCTGAACGGTTGAGAGCGTGGAGCCCTCCAGGCCTTTGCTCTCCCCTCT 


AGGCACACGCGCACTTCCACCTCTGCCCCATCCCAACTGCACCAGCACTGCCTCCCCCAGGGATCCTC 


TCACATCCCACACTGGTCTCTGCACCACCCCTCTGGTTCACACCGCACCCTGCACTCACCCACAGCAG 


CTCCATC CACTGGGAAAACTGGGGTTTGCATCAC TCCACTGCACAGTGTTAGTGGGACCTGGGGGCAA 


GTCCCTTG ACTTCTCTGAGCC TCAGTTTCCTTATGTGAAAGTTGC TGGAACC AAAATGGAGTC ACTTA 




TGCCAAACTC TAATAAAATGGAGTCGGGGGGGC AC ATAGAAGCCC TCACAC ACACATGCCCGTAAC AG 


GATTTATCACCAAGACACGCCTGCATGTAAGACCAGACACAGGGCGTATGGAAAAGCACGTCCTC^AA 




GACTGTAGTATTCCAGATGAGCTGCAGATGCTTACCTACCACGGCCGTCTCCACCAGAAAACCATCGC 


CAACTCCTGCGATC AGC TTGTGACTTAC AAACCTTGTTTAAAAGCTGC TTACATGGACTTCTGTCC TT 




TAAAACGTTCCCC TTGGCTGTGGCCCTC TGTGTATGCCTGGGATCC TTCC AAGC ACTCATAGCCCAGA 




TAGGAATCCTCTGCTCCTCCC AAATAAATTCATC TGTTC 




ORF Start: ATG at 617 | ORF Stop: TGA at 1772





SEQ ID NO: 20 | 385 aa | MW at 42794.8kD

NOV3a,
CG118051-01
Protein 
Sequence 


MKDEPRSTNLFMKLDSVFIWKEPFGLVLIIAPWOTPLNLTLVL 
IAEVLPQYLDQSCFAVVLGGPQETGQLLEHKLDYIFFTGSPRVGKIVOT 

YVDDNCDPQTVA^VAWFCYFNAGQTCVAPDYVLCSPEMQERLLPALQSTITRFYGDDPQSSPNLGRI 
INQKQFQRLRALLGCGRVAIGGQSNESDRYIAPTVLVDVQETEPVMQEEIFGPILPIVNVQSVDEAIK 
FimQEKPLALYAFSNSRQVVNQMLERTSSGSFGGNEGFTYISLLSVPFGGVGHSGMGRYHGKFTFDT 
FSHHRTCLLAPSGLEKLKEIRYPPYTDWNQQLLRWGMGSQSCTIiL 





SEQ ID NO: 21 | 1586 bp

NOV3b,
CG118051-02
DNA Sequence 


CACGAGTTGGTTTGGGAGCTGCCAGTCTCCTGGGAGGATCGCAGTCAGCAGAGCAGGGCTGAGGCCT 


GGGGGTAGGAGCAGAGCCTGCGCATCTGGAGGCAGCATGTCCAAGAAAGGGAGTGGAGGTGCAGCGA 


AGGACCCAGGGGCAGAGCCCACGCTGGGGATGGACCCCTTCGAGGACACACTGCGGCGGCTGCGTGA 


GGCCTTCAACTGAGGGCGCACGCGGCCGGCCGAGTTCCGGGCTGCGCAGCTCCAGGGCCTGGGCCAC 


TTCCTTCAAGAAAACAAGCAGCTTCTGCGCGACGTGCTGGCCCAGGACCTGCATAAGCCAGCTTTCG 


AGGC AGACATATCTGAGCTCATCCTTTGCCAGAACG AGGTTGACTACGC TCTC AAGAACCTTC AGGC 


C TGGAT6AAGGATGAACCACGGTCC ACG AACCTGTTCATGAAGCTGGACTCGGTC TTCATCTGGAAG 
GAACCCOTTGGCCTGGTCCTC&TCATCGCACCCT

TGGTGGGC ACCC TCC CCGC AGGGAATTGCGTGGTGCTGAAGCCGTC AGAAATCAGCCAGGGC AC AGA 
GAAGGTCCTGGCTGAGGTGCTGCCCCAGTACCTGGACCAGAGCTGCTTTGCCGTGGTGCTGGGCGGA 
CCCCAGGAGACAGGGCAGCTGCTAGAGCACAAGTTGGACTACATCTTCTTCACAGGGAGCCCTCGTG 
TGGGCAAGATTGTCATGACTGCTGCCACCAAGCACCTGACGCCTGTCACCCTGGAGCTGGGGGGCAA 
GAACCCCTGCTACGTGGACGACAACTGCGACCCCCAGACCGTGGCCAACCGCGTGGCCTGGTTCTGC 
TACTTCAATGCCGGCCAGACCTGCGTGGCCCCTGACTACGTCCTGTGCAGCCCCGAGATGCAGGAGA 
GGCTGCTGCCCGCCCTGCAGAGCACCATCACCCGTTTCTATGGCGACGACCCCCAGAGCTCCCCAAA 
CCTGGGCCGCATCATCAACCAGAAACAGTTCCAGCGGCTGCGGGCATTGCTGGGCTGCGGCCGCGTG 
GCCATTGGGGGCCAGAGCAACGAGAGCGATCGCTACATCGCCCCCACGGTGCTGGTGGACGTGCAGG 
AGACGGAGCCTGTGATGCAGGAGGAGATC TTCGGGCCCATCC TGCCCATCGTGAACGTGCAGAGCGT 
GGACGAGGCCATCAAGTTCATCAACCGGCAGGAGAAGCCCCTGGCCCTGCACAGTGGGATGGGCCGG 
TACC^CGGCAAGTTCACCTTCGACACCTTCTCCCACCACCGCACCTGCCTGCTCGCCCCCTCCGGCC 
TGGAGAAATTAAAGGAGATCCGCTACCCACCCTATACCGACTGGAACCAGCAGCTGTTACGCTGGGG 
CATGGGCTCCCAGAGC TGCACCCTCC TGTGAGCGTCC CACCCGCCTCC AACGGGTC ACACAGAGAAA 


CCTGAGTCTAGCCATGAGGGGCTTATGCTCCCAACTCACATTGTTCCTCCAGACCGCAGGCTCCCrr 


AGCCTCAGGTTGCTGGAGCTGTCACATGACTGCATCCTGCCTGCC 




ORF Start: ATG at 407 | ORF Stop: TGA at 1436





SEQ ID NO: 22 | 343 aa | MW at 38350.9kD

NOV3b,
CG118051-02
Protein 
Sequence 


MKDEPRSTNLFMKiDSVFIWKEPFGLVLI^ 
VLAEVLPQYIJ3QSCFAVVLGGPQETGQLLEHKLDYIFFTGSP 

PCYVDDNCDPQTVAMIVAWFCYFNAGQTCVAPDYVLCSPEMQERLLPALQSTITRFYGDDPQSSP^ 
GRIINQKQFQRLRALLGCGRVAIGGQSNESDRYIAPTVLTO^ 

EAIKF INRQEKPLALHSGMGRYHGKFTFDTFSHHRTCLLAPSGLEKLKEIRYP P YTDWNQQLLRWGM 
GSQSCTLL 





SEQ ID NO: 23 | 1791 bp

NOV3c,
CG118051-03
DNA Sequence 


T TAAGG AG AATC TTAAAGTGAGGG C TGAGGG AC TCTCCTG ATC C AG AGC TGAGGACTC TCC TG ATCC A 


GAGCTGAGGGCTC TCCTGATGGACCCC TTCGAGGAC ACGCTGCGGCGGCTGCGTGAGGCCTTC AACTG 


AGGGCGCACGCGGCCGGCCGAGTTCCGGGCTGCGCAGCTCCAGGGCCTGGGCCACTTCCTTCAAGAAA 


ACAAGCAGCTTCTGCGCGACGTGCTGGCCCAGGACCTGCATAAGCCAGCTTTCGAGGCAGACATATCT 


GAGCTCATCCTTTGCCAGAACGAGGTTGACTACGCTCTCAAGAACCTTCAGGCCTGGATGAAGGATGA 


ACCACGGTCCACGAACCTGTTCATGAAGCTGGACTCGGTCTTCATCTGGAAGGAACCCTTTGGCCTGG 
TCCTCATCATCGCACCCTGGAACTACCCACTGAACCTGACCCTGGTGCTCCTGGTGGGCGCCCTCGCC 
GCAGGGAATTGCGTGGTGCTGAAGCCGTCAGAAATCAGCCAGGGCACAGAGAAGGTCCTGGCTGAGGT 
GCTGCCCGAGTACCTGGACGAGAGCTGCTTTGCCGTGGTGCTG^ 

TGCTAGAGC^CAAGTTGGACTACATCTTCTTCACAGGGAGCCCTCGTGTGGGCAAGATTGTCATGACT 

GCTGCCACCAAGCACCTGACGCCTGTCACCCTGGAGCTGGGGGGCAAGAACCCCTGCTACGTGGACGA 

CAACTGCGACCCCCAGACCGTGGCCAACCGCGTGGCCTGGTTCTGCTACTTCAATGCCGGCCAGACCT 

GCGTGGCCCCTGACTACGTCCTGTGCAGCCCCGAGATGCAGGAGAGGCTGCTGCCCGCCCTGCAGAGC 

ACCATCACCCGTTTCTATGGCGACGACCCCCAGAGCTCCCCAAACCTGGGCCGCATCATCAACCAGAA 

ACAGTTCCAGCGGCTGCGGGCATTGCTGGGCTGCGGCCGCGTGGCCATTGGGGGCCAGAGC^ 

GCGATCGCTACATCGCCCCCACGGTGCTGGTGGACGTGCAGGAGACGGAGCCTGTGATGCAGGAGGAG 

ATCTTCGGGCCCATCCTGCCCATCGTGAACGTGCAGAGCGTGGACGAGGCCATCAAGTTCATCAACCG 

GC AGGAGAAGC CCCTGGCC CTGTACGCC TTC TCCAACAGCAGCCAGGTTGTGAACC AGATGCTGGAGC 

GGACCAGCAGCGGCAGCTTTGGAGGCAATGAGGGCTTCACCTACATATCTCTGCTGTCCGTGCCATTC 

GGGGGAGTCGGCCACAGTGGGATGGGCCGGTACCACGGCAAGTTCACCTTCGACACCTTCTCCCACCA 

CCGCACCTGCCCGCTCGCCCCCTCCGGCCTGGAGAAATTAAAGGAGATCCGCTACCCACCCTATACCG 

ACTGGAACCAGCAGCTGTTACGCTGGGGCATGGGCTCCCAGAGCTGCACCCTCCTGTGAGCGTCCCAC 

CCGCCTCCAACGGGTCACACAGAGAAACCTGAGTCTAGCCATGAGGGGCTTATGCTCCCAACTCACAT 


TGTTCCTC CAGACCGCAGGC TCCCCCAGCC TCAGGTTGCTGGAGC TGTC AC ATGACTGCATCCTGCCT 


GCCAGGGCTGCAAAGCAAGGTCTTOCTTCTATCTGGGGGACGCTGCTCGAGAGAGGCCGAGAGGCCGC 


AGAACATGCCAGGTGTCCTCACTCACCCCACCCTCCCCAATTCCAGCCCTTTGCCCTCTCGGTCAGGG 




TTGACCAGGCCAAGGGCTAGCAT 




ORF Start: ATG at 330 | ORF Stop: TGA at 1485

SEQ ID NO: 24 | 385 aa | MW at 42653.5kD

NOV3c,
CG118051-03
Protein 
Sequence 


MKDEPRSTNLFMKLDSWIWKEPFGLVLIIAPWNYPLNLTLV^ 

IAEVLPQYLDQSCFAVVLGGPQETGQLLEHKLDYIFFTGSPRVGKIVMTAATKHL^ 

YVDDNCDPQWANRVAWFCYFNAGQT 

INQKQFQRLI^LGCGRVAIGGQSOTSDRyiAPTVLVDVQETEPVMQEEIFGPILPIVNVQSVD 
FITOQEKPLALYAFSNSSQVVNQMLERTSSGSFGGNEGFTYISLLSVPFGGVGHSGMGRYHGKFTFDT 
FSHHRTC PLAPSGLEKLKE IRYPPYTDWNQQLLRWGMGSQSCTLL 
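
The three NOV3 clones differ mainly in how much untranslated sequence flanks the open reading frame. The sketch below derives the 5' untranslated, coding and 3' untranslated lengths from the total length and the ORF start and stop positions given in Table 3A. It assumes 1-based coordinates, with the stop position marking the first base of the stop codon; the `Layout` type and function name are illustrative.

```python
from typing import NamedTuple


class Layout(NamedTuple):
    five_prime_utr: int   # bases before the ATG
    coding: int           # bases from the ATG through the stop codon
    three_prime_utr: int  # bases after the stop codon


def transcript_layout(total_bp: int, orf_start: int, orf_stop: int) -> Layout:
    """Partition a sequence into 5' UTR, coding region and 3' UTR lengths."""
    stop_codon_end = orf_stop + 2  # last base of the three-base stop codon
    if not 1 <= orf_start < orf_stop or stop_codon_end > total_bp:
        raise ValueError("ORF coordinates fall outside the sequence")
    return Layout(
        five_prime_utr=orf_start - 1,
        coding=stop_codon_end - orf_start + 1,
        three_prime_utr=total_bp - stop_codon_end,
    )


# (total length, ORF start, ORF stop) as listed in Table 3A
CLONES = {
    "NOV3a": (2827, 617, 1772),
    "NOV3b": (1586, 407, 1436),
    "NOV3c": (1791, 330, 1485),
}

for name, coordinates in CLONES.items():
    print(name, transcript_layout(*coordinates))
```

Under these assumptions NOV3a and NOV3c share a coding region of the same length, consistent with their common 385-residue protein length, while NOV3b has a shorter coding region that matches its 343 residues.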



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 3B. 



Table 3B. Comparison of NOV3a against NOV3b and NOV3c. 


Protein Sequence | NOV3a Residues | Match Residues | Identities for the Matched Region | Similarities for the Matched Region
NOV3b | 1..385 | 1..343 | 331/385 (85%) | 331/385 (85%)
NOV3c | 1..385 | 1..385 | 363/385 (94%) | 363/385 (94%)



Further analysis of the NOV3a protein yielded the following properties shown in 
Table 3C. 



Table 3C. Protein Sequence Properties NOV3a 


PSort analysis: | 0.7900 probability located in plasma membrane; 0.3000 probability located in Golgi body; 0.2000 probability located in endoplasmic reticulum (membrane); 0.1743 probability located in microbody (peroxisome)
SignalP analysis: | Cleavage site between residues 54 and 55



A search of the NOV3a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 3D.

Table 3D. Geneseq Results for NOV3a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV3a Residues | Match Residues | Identities for the Matched Region | Similarities for the Matched Region | Expect Value
AAB58156 | Lung cancer associated polypeptide sequence SEQ ID 494 - Homo sapiens, 430 aa. [WO200055180-A2, 21-SEP-2000] | 1..353 | 62..414 | 325/353 (92%) | 337/353 (95%) | 0.0
ABB66868 | Drosophila melanogaster polypeptide SEQ ID NO 27396 - Drosophila melanogaster, 561 aa. [WO200171042-A2, 27-SEP-2001] | 14..309 | 95..390 | 158/296 (53%) | 212/296 (71%) | 3e-94
ABB65492 | Drosophila melanogaster polypeptide SEQ ID NO 23268 - Drosophila melanogaster, 561 aa. [WO200171042-A2, 27-SEP-2001] | 14..309 | 95..390 | 158/296 (53%) | 212/296 (71%) | 3e-94
ABP39856 | Staphylococcus epidermidis ORF amino acid sequence SEQ ID NO:4701 - Staphylococcus epidermidis, 464 aa. [US6380370-B1, 30-APR-2002] | 2..365 | 88..451 | 157/366 (42%) | 235/366 (63%) | 1e-85
AAG82730 | S. epidermidis open reading frame protein sequence SEQ ID NO:2554 - Staphylococcus epidermidis, 459 aa. [WO200134809-A2, 17-MAY-2001] | 2..365 | 83..446 | 157/366 (42%) | 235/366 (63%) | 1e-85



In a BLAST search of public sequence databases, the NOV3a protein was found to
have homology to the proteins shown in the BLASTP data in Table 3E.



Table 3E. Public BLASTP Results for NOV3a 


Protein Accession Number | Protein/Organism/Length | NOV3a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
P48448 | Aldehyde dehydrogenase 8 (EC 1.2.1.5) - Homo sapiens (Human), 385 aa. | 1..385 / 1..385 | 385/385 (100%) / 385/385 (100%) | 0.0
BAG03897 | CDNA FLJ35145 fis, clone PLACE6009853, highly similar to ALDEHYDE DEHYDROGENASE 8 (EC 1.2.1.5) - Homo sapiens (Human), 385 aa. | 1..385 / 1..385 | 381/385 (98%) |
P43353 | Aldehyde dehydrogenase 7 (EC 1.2.1.5) - Homo sapiens (Human), 468 aa. | 1..385 / 82..468 | 321/387 (82%) / 345/387 (88%) | 0.0
AAH33099 | Similar to aldehyde dehydrogenase 3 family, member B1 - Homo sapiens (Human), 431 aa. | 13..385 / 57..431 | 315/375 (84%) / 339/375 (90%) | 0.0
Q8VHW0 | Aldehyde dehydrogenase ALDH3B1 (EC 1.2.1.3) - Mus musculus (Mouse), 449 aa (fragment). | 1..385 / 63..449 | 295/387 (76%) / 336/387 (86%) | e-174



PFam analysis predicts that the NOV3a protein contains the domains shown in
Table 3F.



Table 3F. Domain Analysis of NOV3a 


Pfam Domain | NOV3a Match Region | Identities/Similarities for the Matched Region | Expect Value
aldedh | 1..351 | 129/492 (26%) / 299/492 (61%) | 1.1e-103



Example 4. 

The NOV4 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 4A. 



Table 4A. NOV4 Sequence Analysis 




SEQ ID NO:25 | 1636 bp


NOV4a, 
CG120277-01 
DNA Sequence 


CC AGGAGCCCCAGTTACCGGG AGAGGCTGTGTCAAAGGCfinn ATG ARrzv A<2 A tt ^{zrnziCinnr'nrpn a a 


GCGCGCCCGCGCCGCCTTCAGCTCGGGCAGGACCCGTCCGCTGCAGTTCCGATTCCAGCAGCTGGAGG 
CGCTGCAGCGCCTGATCCAGGAGCAGGAGCAGGAGCTGGTGGGCGCGCTGGCCGCAGACCTGCACAAG 
AATGAATGGAACGCCTACTATGAGGAGGTGGTGTACGTCCTAGAGGAGATCGAGTACATGATCCAGAA 
GCTCCCTGAGTGGGCCGCGGATGAGCCCGTGGAGAAGACGCCCCAGACTCAGCAGGACGAGCTCTACA 
TCCACTCGGAGCCACTGGGCGTGGTCCTCGTCATTGGCACCTGGAACTACCCCTTCAACCTCACCATC 
CAGCCCATGGTGGGCGCCATCGCTGCAGGGAACGCAGTGGTCCTCAAGCCCTCGGAGCTGAGTGAGAA 
CATGGCGAGCCTGCTGGCTACCATCATCCCCCAGTACCTGGACAAGGATCTGTACCCAGTAATCAATG 
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GGGGTGTCCCTGAGACGACGGAGCTGCTCAAC^^ 

GGGGTGGGGAAGATCATCATGACGGC TG C TGCC AAGC AC CTG ACCC CTGTC ACGCTGGAGCTGGGAGG 
GAAGAGTCCCTGCTACGTGGACAAGAACTGTGACCTGGACGTGGCCTGCCGACGCATCGCCTGGGGGA 
AATTCATGAACAGTGGCCAGACCTGCGTGGCCCCAGACTACATCCTCTGTGACCCCTCGATCCAGAAC 
CAAATTGTGGAGAAGCTCAAGAAGTCACTGAAAGAGTTCTACGGGGAAGATGCTAAGAAATCCCGGGA 
CTATGGAAGAATCATTAGTGCCCGGCACTTCCAGAGGGTGATGGGCCTGATTGAGGGCCAGAAGGTGG 
CTTATGGGGGCACCGGGGATGCCGCCACTCGCTACATAGCCCCCACCATCCTCACGGACGTGGACCCC 

v-AVj ILL IajA a Vjl_ AAVjlA.Vj\jAUA L U a I tbWs^Clu 1 GCTGCCl* A 1X.\? Iaj 1 GCfcj I\aUGCAGCCTGGA 

GGAGGCCATCCAGTTCATCAACCAGCGTGAGAAGCCCCTOGCCCTCTACATGTTCTCCAGCAACGACA 
AGGTGATTAAGAAGATGATTGCAGAGAC ATCC AGTGGTGGGGTGGCGGCCAACG ATG TCATC GTCCAC 
ATCACCTTGCACTCTCTGCCCTTCGGGGGCGTGGGGAACAGCGGCATGGGATCCTACCATGGCAAGAA 
GAGCTTCGAGACTTTCTCTCACCGCCGCTCTTGCCTGGTGAGGCCTCTGATGAATGATGAAGGCCTGA 
AGGTCAGATAC CCCCCGAGCCCGGCCAAGATGACCCAGCACTGAGGAGGGGTTGCTC CGCC TGGCCTG 
GCCATACTGTGTCCCATCGGAGTGCGGACCACCCTCACTGGCTCTCCTGGCCCTGGAGAATCGCTCCT 




GCAGCCCCAGCCCAGCCCCACTCCTCTGCTGACCTGCTGACCTGTGCACACCCCACTCCCACATGGGC 


CC AGGC CTCACC ATTCC AAGTCTCCACCCCTTTC TAGAC C AATAAAGAGAC AAATAC AATTTTC TAAC 


TCGG 




ORF Start: ATG at 43 | ORF Stop: TGA at 1402


SEQ ID NO:26 | 453 aa | MW at 50412.5 Da


NOV4a, 
CG120277-01 
Protein 
Sequence 


MSKISEAVKRARAAFSSGRTRPLQFRFQQLEALQRLIQEQEQE^ 
EEIEYMIQKLPEWAADEPVEKTPQTQQDELYIHSEPLGVVLVIGTWOTPFl^ 

LKPSEL S ENMASLLAT 1 1 PQYIJDKDLYPVINGG VPETTELLKERFDHILYTG STGVGK I IMT AAAKHL 
TPVTLELGGKSPCYVDKNCDLDVACRRIAWGKFM 

GEDAKKSRDYGRI I SARHFQRVMGLIEGQKVAYGGTGDAATRYIAPTILTDVDPQSPVMQEEI FGPVL 

PIVCTOSLEEAIQFINQREKPLALYMFSSI!®^ 

GMGSYHGKKSFETFSHRRSCLVRPLMNDEGLKVRYPPSPAKMTQH 





SEQ ID NO:27 | 1554 bp


NOV4b, 
CG120277-02 
DNA Sequence 


GAGCCCCAGTTACCGGGAGAGGCTGTGTCAAAGGCGCCATGAGCAAGATCAGCGAGGCCGTGAAGCG 


CGCCCGCGCCGCCTTCAGCTCGGGCAGGACCCGTCCGCTGCAGTTCCGGATCCAGCAGCTGGAGGCG 
CTGCAGCGCCTGATCCAGGAGCAGGAGCAGGAGCTGGTGGGCGCGCTGGCCGCAGACCTGCACAAGA 
ATGAATGGAACGCCTACTATGAGGAGGTGGTGTACGTCCTAGAGGAGATCGAGTACATGATCCAGAA 
GCTCCC TGAGTGGGCCGCGGATGAGCC CGTGGAGAAGACGCCCCAGAC TCAGCAGGACGAGCTCTAC 
ATCCACTCGGAGCCACTGGGCGTGGTCC^TCGTCATTGGCACCTGGAACTACCCCTTCAACCTCACCA 
TC C AGCCC ATGGTGGGCGCC ATCG CTGC AGGGAACGCAGTGGTCCTCAAGCCCTCGGAGCTGAGTGA 
GAACATGGCGAGCCTGCTGGCTAC(^TCATCCCCCAGTACCTGG^CAAGGATC0X5TACCCAGTAATC 
AATGGGGGTGTCCCTGAGACCACGGAGCTGCTCAAGGAGAGGTTCGACCATATCCTGTACACGGGCA 
GCACGGGGGTGGGGAAGATCATCATGACGGCTGCTGCCAAGCACCTGACCCCTGTCACGCTGGAGCT 
GGG AGGG AAGAGTC CCTGCT ACGTGGACAAGAACTGTGACC TGGACGTGGCCTGCCGACGC ATCGCC 
TGGGGGAAATTCATGAACAGTGGCCAGACCTGCGTGGCCCCAGACTACATCCTCTGTGACCCCTCGA 
TCCAGAACCAAATTGTGGAGAAGC TCAAGAAGTCACTGAAAGAGTTCTACGGGGAAGATGC TAAGAA 
ATCCCGGGACTATGGAAGAATCATTAGTGCCCGGCACTTCCAGAGGGTGATGGGCCTGATTGAGGGC 
CAGAAGGTGGCTTATGGGGGCACCGGGGATGCCGCCACTCGCTACATAGCCCCCACCATCCXCACGG 
ACGTGGACCCCCAGTCCCCGGTGATGCAAGAGGAGATCTTCGGGCCTGTGCTGCCCATCGTGTGCGT 
GCGCAGCCTGGAGGAGGCCATCCAGTTCATCAACCAGCGTGAGAAGCCCCTGGCCCTCTACATGTTC 
TCCAGCAACGACAAGGTGATTAAGAAGATGATTGCAGAGACATCCAGTGGTGGGGTGGCGGCCAACG 
ATGTCATCGTCCACATCACCTTGCACTCTCTGCCCTTCGGGGGCGTGGGGAACAGCGGCATGGTGAG 
GCCTCTGATGAATGATGAAGGCCTGAAGGTGAGATACCCCCCGAG 

TGAGGAGGGGTTGCTCCGTCTGGCCTGGCCATACTGTGTCCCATCGGAGTGCGGACCACCCTCACTG 


GCTCTCCTGGCCCTGGGAGAATCGCTCCTGCAGCCCCAGCCCAGCCCCACTCCTCTGCTGACCTGCT 


GACCTGTGCACACCCCACTCCCACATGGGCCCAGGCCTCACCATTCCAAGTCTCCACCCCTTTCTAG 






ORF Start: ATG at 39 | ORF Stop: TGA at 1341
SEQ ID NO:28 | 434 aa | MW at 48169.0 Da


NOV4b,
CG120277-02
Protein
Sequence
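
The ORF coordinates and protein lengths reported in Table 4A can be cross-checked against each other. The following is a minimal sketch, assuming that the ORF Start position is the first base of the ATG codon and that the ORF Stop position is the first base of the stop codon (an assumption that reproduces the lengths stated in the table); the function name `orf_protein_length` is hypothetical and is not part of the original disclosure.

```python
def orf_protein_length(orf_start: int, orf_stop: int) -> int:
    """Return the number of encoded residues for 1-based ORF coordinates.

    Assumption: orf_start is the first base of the ATG codon and orf_stop is
    the first base of the stop codon, so the coding span is
    orf_stop - orf_start bases.
    """
    coding_bases = orf_stop - orf_start
    if coding_bases <= 0 or coding_bases % 3 != 0:
        raise ValueError("ORF span is not a positive multiple of three")
    return coding_bases // 3


# (ORF start, ORF stop, reported protein length) as listed in Table 4A.
reported = {
    "NOV4a": (43, 1402, 453),
    "NOV4b": (39, 1341, 434),
}

for name, (start, stop, length) in reported.items():
    # Each reported length should equal the coding span divided by three.
    assert orf_protein_length(start, stop) == length, name
```

Under the same assumption, the check also holds for the other clones in this section, for example NOV5a (ATG at 394, TAA at 2029, 545 aa).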





Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 4B. 



Table 4B. Comparison of NOV4a against NOV4b. 


Protein Sequence | NOV4a Residues/Match Residues | Identities/Similarities for the Matched Region
NOV4b | 1..453 / 1..434 | 401/453 (88%) / 401/453 (88%)
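
The identity percentage in Table 4B follows from the residue counts. The following is a minimal sketch, assuming that the percentage is the ratio of identical residues to aligned positions truncated to a whole number (401/453 is about 88.5%, which is reported as 88%); the helper name `percent_truncated` is hypothetical.

```python
def percent_truncated(matches: int, aligned: int) -> int:
    """Integer percentage of matches over aligned positions, truncated."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    if not 0 <= matches <= aligned:
        raise ValueError("matches must lie between 0 and the aligned length")
    return (100 * matches) // aligned


# NOV4a versus NOV4b: 401 identical residues over 453 aligned positions.
nov4b_identity = percent_truncated(401, 453)
assert nov4b_identity == 88
```

Truncation rather than rounding is an inference from the reported identity values, not something the text states.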



Further analysis of the NOV4a protein yielded the following properties shown in
Table 4C.



Table 4C. Protein Sequence Properties NOV4a

PSort analysis: | 0.7636 probability located in mitochondrial matrix space; 0.4422 probability located in mitochondrial inner membrane; 0.4422 probability located in mitochondrial intermembrane space; 0.4422 probability located in mitochondrial outer membrane
SignalP analysis: | No Known Signal Sequence Predicted
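
The PSort entry in Table 4C is a semicolon-separated list of localization probabilities. The following is a minimal sketch of reading such an entry into (compartment, probability) pairs, assuming the fixed wording "probability located in" used throughout these tables; the function name `parse_psort` is hypothetical.

```python
import re

# One entry looks like "0.7636 probability located in mitochondrial matrix space".
PSORT_ENTRY = re.compile(r"(\d+\.\d+) probability located in ([^;]+)")


def parse_psort(line: str) -> list[tuple[str, float]]:
    """Return (compartment, probability) pairs, highest probability first."""
    pairs = [(name.strip(), float(prob)) for prob, name in PSORT_ENTRY.findall(line)]
    # sorted() is stable, so entries with equal probability keep their order.
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


nov4a_psort = (
    "0.7636 probability located in mitochondrial matrix space; "
    "0.4422 probability located in mitochondrial inner membrane; "
    "0.4422 probability located in mitochondrial intermembrane space; "
    "0.4422 probability located in mitochondrial outer membrane"
)
top_compartment, top_probability = parse_psort(nov4a_psort)[0]
```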



A search of the NOV4a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 4D.



Table 4D. Geneseq Results for NOV4a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV4a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
AAB58156 | Lung cancer associated polypeptide sequence SEQ ID 494 - Homo sapiens, 430 aa. [WO200055180-A2, 21-SEP-2000] | 48..431 / 28..411 | 208/384 (54%) / 277/384 (71%) | e-124
ABB66868 | Drosophila melanogaster polypeptide SEQ ID NO 27396 - Drosophila melanogaster, 561 aa. [WO200171042-A2, 27-SEP-2001] | 1..394 / 1..394 | 199/394 (50%) / 270/394 (68%) | e-115
ABB65492 | Drosophila melanogaster polypeptide SEQ ID NO 23268 - Drosophila melanogaster, 561 aa. [WO200171042-A2, 27-SEP-2001] | 1..394 / 1..394 | 199/394 (50%) / 270/394 (68%) | e-115
AAG21988 | Arabidopsis thaliana protein fragment SEQ ID NO: 24747 - Arabidopsis thaliana, ^fo^t aa. [EP1033405-A2, 06-SEP-2000] | 2..445 / 10..456 | 210/449 (46%) / 288/449 (63%) | e-112
AAGU789 | Arabidopsis thaliana protein fragment SEQ ID NO: 10644 - Arabidopsis thaliana, 484 aa. [EP1033405-A2, 06-SEP-2000] | 2..445 / 10..456 | 210/449 (46%) / 288/449 (63%) | e-112
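
The Expect values in Table 4D use the BLAST shorthand in which a bare exponent such as e-124 stands for 1e-124. The following is a minimal sketch of converting these strings to numbers so that hits can be compared; the function name `parse_expect` is hypothetical.

```python
def parse_expect(text: str) -> float:
    """Convert an Expect string such as 'e-124', '3e-94' or '0.0' to a float."""
    cleaned = text.strip().lower()
    if cleaned.startswith("e"):
        # A bare exponent implies a mantissa of 1.
        cleaned = "1" + cleaned
    return float(cleaned)


# Expect values from Table 4D, in table order.
table_4d_expect = ["e-124", "e-115", "e-115", "e-112", "e-112"]
parsed = [parse_expect(value) for value in table_4d_expect]

# A smaller Expect value indicates a more significant match.
best_index = parsed.index(min(parsed))
```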



In a BLAST search of public sequence databases, the NOV4a protein was found to
have homology to the proteins shown in the BLASTP data in Table 4E.



Table 4E. Public BLASTP Results for NOV4a 


Protein Accession Number | Protein/Organism/Length | NOV4a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
P30838 | Aldehyde dehydrogenase, dimeric NADP-preferring (EC 1.2.1.5) (ALDH class 3) (ALDHIII) - Homo sapiens (Human), 453 aa. | 1..453 / 1..453 | 453/453 (100%) / 453/453 (100%) | 0.0
Q9BT37 | Aldehyde dehydrogenase 3 (Aldehyde dehydrogenase 3 family, member A1) - Homo sapiens (Human), 453 aa. | 1..453 / 1..453 | 452/453 (99%) / 452/453 (99%) | 0.0
A42584 | aldehyde dehydrogenase (NAD(P)+) (EC 1.2.1.5) 3 - human, 453 aa. | 1..453 / 1..453 | 450/453 (99%) / 451/453 (99%) | 0.0
A30149 | aldehyde dehydrogenase (NADP+) (EC 1.2.1.4) 3, tumor-associated [similarity] - rat, 453 aa. | 1..453 / 1..453 | 370/453 (81%) / 415/453 (90%) | 0.0
P11883 | Aldehyde dehydrogenase, dimeric NADP-preferring (EC 1.2.1.5) (ALDH class 3) (Tumor-associated aldehyde dehydrogenase) (HTC-ALDH) - Rattus norvegicus (Rat), 452 aa. | 1..452 | JOy/4jZ \pi70) / 414/452 (90%) | 0.0





PFam analysis predicts that the NOV4a protein contains the domains shown in
Table 4F.



Table 4F. Domain Analysis of NOV4a 


Pfam Domain | NOV4a Match Region | Identities/Similarities for the Matched Region | Expect Value
aldedh | 1..432 | 182/492 (37%) / 401/492 (82%) | 7.4e-206



Example 5. 

The NOV5 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 5A.



Table 5A. NOV5 Sequence Analysis


SEQ ID NO:29 | 2316 bp


NOV5a, 
CG140468-01 
DNA Sequence 


GCCACGAAGGCCACAGACGCCTTCCCCCTTGGACTCTCATTCCCTTTTCCACGGAGCCCCGCGCTTTC 


GTGAGCCC C CTCGAGG AACCTGGTCTCCGCATCCAGTTACC ACCTCCTGCC TCAGAGGCCATCTGAGC 


CCTTCGCACCTCGCCCCTCAGTCCCCCCTTGCCCCCCCGCGGAGATCGCCTCGCTCCCTCCCGCCCCC 


CCATCATCCCTTCCCTCGCAGTTCCCCTGTCCTGAGGGGAGCCCCGCCACGGCAGCGACAGCGGGCAG 


GAGGGAGAAAGTGAAGGTTGGGCGACACTTGGC CTCACTCCCGGC TAGGCGC ACCCACGGGGAGGAGA 


GGAGGAGCCGAGAGAGCTGAGCAGCGCGGAAGTAGCTGCTGCTGGTGGTGACAATGTCAAATAACGGC 


CTAGACATTCAAGACAAACCCCCAGCCCCTCCGATGAGAAATACCAGCACTATGATTGGAGTCGGCAG 
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CAAAGATGCTGGAACCCTAA&CCATGGTT 

AGGACCG ATTTTACCG ATCCATTT TACC TGGAGATAAAAC AAATAAAAAGAAAGAGAAAGAGCGGC C A 
GAGATTTCTCTCCC TTC AGATTTTGAACACACAATTCATGTCGGTTTTGATGC TGTCAC AGGGGAGTT 
TACGGGAATGCCAGAGCAGTGGGCCCGCTTGCTTCAGACATCAAATATCACTAAGTCGGAGCAGAAGA 
AAAACCCGCAGGCTGTTCTGGATGTCTTGGAGTTTTACAACTCGAAGAAGACATCCAACAGCCAGAAA 
TACATGAGCTTTACAGATAAGTCAGCTGAGGATTACAATTCTTCTAATGCCTTGAATGTGAAGGCTGT 
GTCTGAGACTCCTGCAGTGCCACCAGTTTCAGAAGATGAGGATGATGATGATGATGATGCTACCCCAC 
CACCAGTGATTGCTCCACGCCCAGAGCACACAAAATCTGTATACACACGGTCTGTGATTGAACCACTT 
CCTGTCACTCCAACTCGGGACGTGGCTACATCTCCCATTTC ACC TAG TGAAAATAACACCACTCC ACC 
AGATGCTTTGACCCGGAATACTGAGAAGCAGAAGAAGAAGCCTAAAATGTCTGATGAGGAGATCTTGG 
AGAAATTACGAAGCATAGTGAGTGTGGGCGATCCTAAGAAGAAATATACACGGTTTGAGAAGATTGGA 
CAAGGTGCTTCAGGCACCGTGTACACAGCAATGGATGTGGCCACAGGACAGGAGGTGGCCATTAAGCA 
GATGAATCTTCAGCAGCAGCCO^AGAAAGAGCTCATTATTAATGAGATCCTGGTCATGAGGGAAAACA 
AGAACCC AAACATTGTGAATTACTTGGACAGTTACCTCGTGGGAGATG AGC TGTGGGT TGTTATGG AA 
TACTTGGCTGGAGGCTCCTTGACAGATGTGGTGACAGAAACTTGCATGGATGAAGGCCAAATTGCAGC 
TGTGTGCCGTGAGTGTCTGCAGGCTCTGGAGTTCTTGCATTCGAACCAGGTCATTCACAGAGACATCA 
AGAGTGACAATATTCTGTTGGGAATGGATGGCTCTGTC AAGCTAACTGAC T TTGGATTCTGTGC ACAG 
ATAACCCCAGAGCAGAGCAAACGGAGCACCATGGTAGGAACCCCATACTGGATGGCACCAGAGGTTGT 
GACACGAAAGGCCTATGGGCCCAAGGTTGACATCTGGTCCCTGGGCATCATGGCCATCGAAATGATTG 
AAGGGGAGCCTCCATACCTCAATGAAAACCCTCTGAGAGCCTTGTACCTCATTGCCACCAATGGGACC 
CCAGAACTTCAGAACCCAGAGAAGCTGTCAGCTATCTTCCGGGACTTTCTGAACCGCTGTCTCGATAT 
GGLA.TGTGGAGAAG AGAG GTTC AGCTAAAG AGCTGC TACAGCATC AATTCC TG AAGA C C 
TCTCCAGCCTCACTCCACTGATTGCTGCAGCTAAGGAGGCAACAAAGAACAATCACTAAAACCACACT 
CACCCCAGCCTCATTGTGCCAAGCTCTGTGAGATAAATGCACATTTCAGAAATTCCAACTCCTGATGC 


CCTCTTCTCCTTGCCTTGCTTCTCCCATTTCCTGATCTAGCACTCCTCAAGACTTTGATCCT0X5GAAA 


CCGTGTGTCCAGCATTGAAGAGAACTGCAACTGAATGACTAATCAGATGATGGCCATTTCTAAATAAG 


GAATTTCCTCCCAATTCATGGATATGAGGGTGGTTTATGATTAAGGGTTTATATAAATAAATGTTTCT 


AGTC 




ORF Start: ATG at 394 | ORF Stop: TAA at 2029


SEQ ID NO: 30 | 545 aa | MW at 60660.3 Da


NOV5a, 
CG140468-01 
Protein 
Sequence 


MSNNGLD I QDK PPAPPMRNTS TMIGVGSKDAGTLNHGSKPLPPNPEEKKKKDRF YRS IIiPGDKTMKKK 

EKER PEI SL P SDFEHTIHVGFDAVTGEFTGMPEQWARLLQTSNI TKSEQKKNPQAVLDVLEFYNSKKT 

SNSQKYMSFTDKSAEDYNSSNALNVKAVSETPAVPPVSEDEDDDDDDATPPPVIAPRPEHTKSVYTRS 

VIEPLPVTPTRDVATS PI SPTENNTT PPDALTRNTEKQKKKPKMSDEEILEKLRS I VSVGDPKKK YTR 

FEKIGQGASGTVYTAMDVATGQEVAIKQMNLQQ 

WVVMEYLAGGSLTDVVTETO^E^ 

GFCAQITPEQSKRSTMVGTPYWMAPEWTRKAYGPKVDIWSLGIM^ 

ATNGTPELQNPEKLSAIFRDFI^CLD^V^ 

H 





SEQ ID NO: 31 | 957 bp


NOV5b, 

CG140468-02 
DNA Sequence 


GACAATGTCAAATAACGGCCTAGACATTCAAGACAAACCCCCAGCCCCTCCGATGAGAAATACCAGC 
ACTATGATTGGAGCCGGCAGCAAAGATGCTGGAACCCTAAACCATGGTTCTAAACCTCTGCCTCCAA 
ACCCAGAGGAGAAGAAAAAGAAGGACCGATTTTACCGATCCATTTTACCTGGAGATAAAACAAATAA 
AAAGAAAGAGAAAGAGCGGCCAGAGATTTCTCTCCCTTCAGATTTTGAACACACAATTCATGTCGGT 
TTTGATGCTGTCACAGGGGAGTTTACGGGAATGCCAGAGCAGTGGGCCCGCTTGCTTCAGACATCAA 
ATATCACTAAGTCGGAGCAGAAGAAAAACCCGCAGGCTGTTCTGGATGTGTTGGAGTTTTACAACTC 
GAAGAAGACATCCAACAGCCAGAAATACATGAGCTTTACAGATAAGTCAGCTGAGGATTACAATTCT 
TCTAATGCC TTGAATGTGAAGGCTGTGTCTGAGAC TCCTGCAGTGCCACC AGTTTCAGAAGATGAGG 
ATGATGATGATGATGATGCTACCCCACCACCAGTGATTGCTCCACGCCCAGAGCACACAAAATCTGT 
ATACACACGGTCTGTGATTGAACCACTTCCTGTCACTCCAACTCGGGACGTGGCTACATCTCCCATT 
TCACCTACTGAAAATAACACCACTCCACCAGATGCTTTGACCCGGAATACTGAGAAGCAGAAGAAGA 
AGCCTAAAATGTCTGATGAGGAGATCTTGGAGAAATTACGAAGCATAGTGAGTGTGGGCGATCCTAA 
GAAGAAATATAC^CGGTTTGAGAAGATTGCCAAGCCCCTCTCCAGCCTCACTCCACTGATTGCTGCA 
GCTAAGGAGGCAACAAAGAACAATCACTAAAACCACACTCACCCCAGCCTCATTGTGCCAAGCCTTC 


TGTGAGATAAATGCACATT 




ORF Start: ATG at 5 | ORF Stop: TAA at 899
SEQ ID NO:32 | 298 aa | MW at 32989.7 Da


NOV5b,
CG140468-02
Protein 
Sequence 


MSNNGLDIQDKPPAPPMimTSTMIGAGSKDAGTLOTGSKPLPPNPEEKKKKDRF 
KEKERPEISLPSDFEHTIHVGFDAVTGEFTGMPEQWA 

KTSNSQKYMS FTDKS AEDYNSSNALNVKAVSET PAVPPVSEDEDDDDDDATPPPVI APRPEHTKSVY 
TRSVI EPL PVTPTRDVATSP I SPTENOTTPPDALTRNTKKQKKKPKMSDEEILEKLRS I VSVGDPKK 
KYTRFEKIAKPLSSLTPLIAAAKEATKNNH 



Sequence comparison of the above protein sequences yields the following sequence
relationships shown in Table 5B.



Table 5B. Comparison of NOV5a against NOV5b. 


Protein Sequence | NOV5a Residues/Match Residues | Identities/Similarities for the Matched Region
NOV5b | 1..281 / 1..281 | 238/281 (84%) / 239/281 (84%)
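
The residue ranges in Table 5B are inclusive, so a range written 1..281 spans 281 residues, which matches the denominator of the identity ratio. The following is a minimal sketch of that arithmetic; the function name `range_length` is hypothetical.

```python
def range_length(text: str) -> int:
    """Length of an inclusive residue range written as 'start..end'."""
    start, end = (int(part) for part in text.split(".."))
    if end < start:
        raise ValueError("range end precedes range start")
    return end - start + 1


# NOV5a residues and NOV5b match residues from Table 5B.
query_span = range_length("1..281")
match_span = range_length("1..281")

# With no gaps, both spans equal the number of aligned positions (281).
assert query_span == match_span == 281
```

Where an alignment contains gaps, the denominator can exceed either span, as in tables where a 385-residue query range is reported against 387 aligned positions.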



Further analysis of the NOV5a protein yielded the following properties shown in
Table 5C.

Table 5C. Protein Sequence Properties NOV5a

PSort analysis: | 0.7000 probability located in nucleus; 0.3000 probability located in microbody (peroxisome); 0.1000 probability located in mitochondrial matrix space; 0.1000 probability located in lysosome (lumen)
SignalP analysis: | No Known Signal Sequence Predicted



A search of the NOV5a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 5D.



Table 5D. Geneseq Results for NOV5a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV5a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
AAB03968 | p-21 activated protein kinase (PAK1) - Homo sapiens, 545 aa. [WO200060062-A2, 12-OCT-2000] | 1..545 / 1..545 | 544/545 (99%) / 545/545 (99%) | 0.0
AAY55958 | Human STE20-related protein kinase PAK1_h - Homo sapiens, 545 aa. [WO9953036-A2, 21-OCT-1999] | 1..545 / 1..545 | 541/545 (99%) / 542/545 (99%) | 0.0
ABG30251 | Novel human diagnostic protein #30242 - Homo sapiens, 587 aa. [WO200175067-A2, 11-OCT-2001] | 1..542 / 7..557 | 474/556 (85%) / 500/556 (89%) | 0.0
AAW72757 | Human doublin - Homo sapiens, 544 aa. [WO9840495-A1, 17-SEP-1998] | 3..544 / 2..542 | 444/552 (80%) / 489/552 (88%) | 0.0
ABB57290 | Mouse ischaemic condition related protein sequence SEQ ID NO:817 - Mus musculus, 544 aa. [WO200188188-A2, 22-NOV-2001] | 3..544 / 2..542 | 441/552 (79%) / 483/552 (86%) | 0.0



In a BLAST search of public sequence databases, the NOV5a protein was found to
have homology to the proteins shown in the BLASTP data in Table 5E.



Table 5E. Public BLASTP Results for NOV5a 


Protein Accession Number | Protein/Organism/Length | NOV5a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
Q13153 | Serine/threonine-protein kinase PAK 1 (EC 2.7.1.-) (p21-activated kinase 1) (PAK-1) (P65-PAK) (Alpha-PAK) - Homo sapiens (Human), 545 aa. | 1..545 / 1..545 | 545/545 (100%) / 545/545 (100%) | 0.0
| Serine/threonine-protein kinase PAK1 (EC 2.7.1.-) (p21-activated kinase 1) (PAK-1) (P68-PAK) (Alpha-PAK) (Protein kinase MUK2) - Rattus norvegicus (Rat), 544 aa. | 7w> 1 / 1..544 | 539/545 (98%) |
S40482 | serine/threonine-specific protein kinase (EC 2.7.1.-) - rat, 544 aa. | 1..545 / 1..544 | 534/545 (97%) / 537/545 (97%) | 0.0
L/OOvrtJ | Serine/threonine-protein kinase PAK 1 (EC 2.7.1.-) (p21-activated kinase 1) (PAK-1) (P65-PAK) (Alpha-PAK) (CDC42/RAC effector kinase PAK-A) - Mus musculus (Mouse), 545 aa. | 1 / 1..545 | 537/545 (98%) |
O75561 | P21 activated kinase 1B - Homo sapiens (Human), 553 aa. | 1..522 / 1..522 | 517/522 (99%) / 520/522 (99%) | 0.0



PFam analysis predicts that the NOV5a protein contains the domains shown in
Table 5F.



Table 5F. Domain Analysis of NOV5a 


Pfam Domain | NOV5a Match Region | Identities/Similarities for the Matched Region | Expect Value
PBD | 75..135 | 37/64 (58%) / 59/64 (92%) | 3.4e-34
pkinase | 270..521 | 94/291 (32%) / 208/291 (71%) | 5.7e-90
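
The two Pfam match regions in Table 5F can be checked for consistency with the 545-residue length reported for NOV5a. The following is a minimal sketch, assuming inclusive 1-based coordinates; the `Domain` type and the `domains_fit` helper are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    name: str
    start: int  # first residue of the match region (1-based, inclusive)
    end: int    # last residue of the match region (inclusive)


def domains_fit(domains: list[Domain], protein_length: int) -> bool:
    """True if every domain lies within the protein and no two overlap."""
    ordered = sorted(domains, key=lambda d: d.start)
    inside = all(1 <= d.start <= d.end <= protein_length for d in ordered)
    disjoint = all(a.end < b.start for a, b in zip(ordered, ordered[1:]))
    return inside and disjoint


# Match regions from Table 5F; NOV5a is reported as 545 aa.
nov5a_domains = [Domain("PBD", 75, 135), Domain("pkinase", 270, 521)]
nov5a_consistent = domains_fit(nov5a_domains, 545)
```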



Example 6. 

The NOV6 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 6A. 



Table 6A. NOV6 Sequence Analysis 



SEQ ID NO: 33 | 3255 bp


NOV6a,
CG142182-01 
DNA Sequence 



GACAGCTTTGGGTGGACCAGTAATGAGGAAATGAGGi 



CTTCAGCGC TTTGGAAAC TTC TTTAGTTGGGACCTCCGGTCATGAC CTCpATCTATCGTCTGTACC ATG 
GAACCATTGTTAACCAGATTGTTTGTAAAGAATGTAAGAACGTTAGCGAGAGGCAGGAAGACTTCTTA 
GATCTAACAGTAGCAGTCAAAAATGTATCCGGTTTGGAAGATGCTCTCTGGAACATGTATGTAGAAGA 
GGAAGTTTTTGATTGTGACAACTTGTACCACTGTGGAACTTGTGACAGGCTGGTTAAAGCAGCAAAGT 
CGGCC AAATT ACGTAAGCTGCCTCCTTTTCTTACTGTTTCATTAC TAAGATTTAATTTTGAT T TTGTG 
AAATGC GAACGCTACAAGGAAACTAGCTGTTATACATTCCCTCTC CGGATTAATCTCAAGCCC TTTTG 
TGAACAGAGTGAATTGGATGACTTAGAATATATATATGACCTCTTCTCAGTTATTATACACAAAGGTG 
GCTGCTACGGAGGCCATTACCATGTATATATTAAAGATGTTGATCATTTGGGAAACTGGCAGTTTCAA 
GAGGAAAAAAGTAAAC CAG ATGTGAATC TG AAAGATC TCCAGAGTGAAGAAGAG ATTGATC ATCCACT 
GATGATTCTAAAAGCAATC TTATTAGAGGAGGAGAATAATCTAATTCC TGTTGATC AGCTGGGCCAGA 
AAC TTTTGAAAAAGAT AGGAATATCT TGGAACAAGAAGTAC AGAAAAC AG C AT GGACCATTGCGGAAG 
TTCTTACAGCTCCATTCTCAGATATTTCTACTCAGTTCAGATGAAAGTACAGTTCGTCTCTTGAAGAA 
TAGTTCTCTCCAGGCTGAGTCTGATTTC CAAAGGAATGACCAGCAAATTTTC AAGATGC TTCCTCCAG 
AATCCC C AGGTTTAAACAATAGC ATCTC CTGTCCCCAC TGGTTTGATATAAATGATTCTAAAGTCC AG 
CCAATCAGGGAAAAGGATATTGAACAGCAATTTCAGGGTAAAGAAAGTGCCTACATGTTGTTTTATCG 
GAAATCCCAGTTGCAGAGACCCCCTGAAGCTCGAGCTAATCCAAGATATGGGGTTCCATGTCATTTAC 
TGAATGAAATGGATGCAGCTAACATTGAACTGCAAACCAAAAGGGCAGAATGTGATTCTGCAAACAAT 
ACTTTTGAATTGCATCTTCACCTGGGCCCTCAGTATCATTTCTTCAATGGGGCTCTG(^CC 
CTC TCAAAGAGAAAGCGTGTGGGATTTG AC CTTTGATAAAAGAAAAACTTTAGG 
CAATATTTCAGCTGTTAGAATTTTGGGAAGGAGACATGGTTCTTAGTGTTGCAAAGCTTGTACCAGCA 
GGACTTCACATTTACCAGTCACTTGGCGGGGATGAACTGACACTGTGTGAAACTGAAATTGCTGATGG 
GGAAGACATCTTTGTGTGGAATGGGGTGGAGGTTGGTGGAGTCCACATTCAAACTGGTATTGACTGCG 
AACC TC TACTTTTA7VATG TTC TTCATCTAGAC ACAAGCAGTGATGG AGAAAAGTGTTGTCAGGTGAT A 
GAATCTCCACATGTCTTTCCAGCTAATGCAGAAGTGGGCACTGTCCTCACAGCCTTAGCAATCCCAGC 
AGGTGTCATCTTCATCAACAGTGCTGGATGTCCAGGTGGGGAGGGTTGGACGGCCATCCCCAAGGAAG 
ACATGAGGAAGACGTTCAGGGAGCAAGGGCTCAGAAATGGAAGCTCAATTTTAATTCAGGATTC 
GATGATAACAGCTTGTTGACCAAGGAAGAGAAATGGGTCACTAGTATGAATGAGATTGACTGGCTCCA 
CGTTAAAAATTTATGCC^GTTAGAATCTGAAGAGAAGCAAGTTAAAATATCAGC^CTGTTAACAC^ 
TGGTGTTTGATAT TCGAATTAAAGCC ATAAAGGAATTAAAATTAATGAAGGAAC TAGCTGACAACAGC 
TGTTTGAGACCTATTGATAGAAATGGGAAGCTTCTTTGTCCAGTGCCGGACAGCTATACTTTGAAGGA 
AGCAGAATTGAAGATGGGAAGTTCATTGGGACTGTGTCTTGGAAAAGCACCAAGTTCGTCTCAGTTGT 
TCCTGTTTTTTGCAATGGGGAGTGACGTTCAACCTGGGACAGAAATGGAAATCGTAGTAGAAGAAACA 
ATATCTGTGAGAGATTGTTTAAAGTTAATGCTGAAGAAATCTGGCCTAC AAGAC TCC TTTATAGGAGA 
TGCCTGGC ATTTACGAAAAATGG ATTGGTGCTATGAAGC TGGAGAGC CTTTATGTGAAGAAGATGCAA 
CACTGAAAGAACTTCTGATATGT TC TGGAGATACTTTGCTTTTAATTGAAGGAC AAC TTCC TCCTCTG 
GGTTTC CTGAAGGTGC CCATCTGGTGGTACCAGC TTC AGGGTCCC TCAGGACACTGGGAGAGTCATC A 
GGACCAGACCAACTGTACTTCGTCTTGGGGCAGAGTTTGGAGAGCCACTTCCAGCCAAGGTGCTTCTG 
GGAACGAGCCTGCGCAAGTTTCTCTCCTCTACTTGGGAGACATAGAGATCTCAGAAGATGCCACGCTG 
GCGGAGCTGAAGTCTCAGGCCATGACCTTGCCTCCTTTCCTGGAGTTCGGTGTCCCGTCCCCAGCCCA 
CCTCAGAGCCTGGACGGTCGAGAGGAAGCGCCCAGGCAGGCTTTTACGAACTGACCGGCAGCCACTCA 
GGGAATATAAACTAGGACGGAGAATTGAGATCTGCTTAGAGCCCCTTCAGAAAGGCGAAAACTTGGGC 
CCCCAGGACGTGCTGCTGAGGACACAGGTGCGCATCCCTGGTGAGAGGACCTATGCCCCTGCCCTGGA 
CCTGGTGTGGAACGCGGCCCAGGGTGGGACTGCCGGCTCCCTGAGGCAGAGAGTTGCCGATTTCTATT 
GTCTTCCCGTGGAGAAGATTGAAATTGCCAAATACTTTCCCGAAAAGTTCGAGTGGCTTCCGATATCT 
AGCTGGAACCAACAAATAACCAAGAGGAAAAAAAAAAAAAAACAAGATTATTTGCAAGGGGCACCGTA 
TTACTTGAAAGACGGAGATACTATTGGTGTTAAGGTAAGTTGTTTAACAGCAAATTTACCACTTTGAG 
AAGACACGAGGGTCACATGATTTTATAGAGACGTTTTATTGAATCTTCAAGACACAGAT 



ORF Start: ATG at 31 | ORF Stop: TGA at 3193


SEQ ID NO: 34 | 1054 aa | MW at 119613.5 Da


NOV6a, 
CG142182-01 
Protein 
Sequence 


MRQHDVQELNRILFSALETSLVGTSGHDLIYRLYHGTIVNQIVCKECKNVSERQEDFLDLW 
GLEDALWNMYVEEEVFDCDNLYHCGTCDRLVKAAKSAKLR^ 
YTFPLRINLKPFCEQSELDDLEYIYDLFSVIIHKGGCYGGHYHVYIKD^ 
KDLQSEEEIDHPLMILKAILLEEENNLIPVDQLGQK^^ 

LSSDESTVRLLKNSSLQAESDFQROTQQIFKMjPPESPGLOTSISCPHWFDINDSKVQPIREKDIEQQ 
FQGKESAYl^FYRKSQLQRPPEARANPRYGVPCHLLNE^i^ 

QYHFFNGALHPWSQTESVWDLTFDKRKTLGDLRQSIFQLLEFWEGDMVLSVAKLVPAGLHIYQSLGG 
DELTLCETEIADGEDIFVV^GVEVGGVHIQTGIDCEPLL^ 

EVGTVLTALAIPAGVIFINSAGC PGGEGWTAIPKEDMRKTFREQGLRNGSSIL IQDSHDDNSLLTKEE 
KWOTSMNEIDWIiHVKI^CQLESEEKQVKISATVNTMTOD 

LLCFVPDSYTLKEAELKMGS SLGLCLGKAPS SSQLFLFFAMGSDVQPGTEMEIWEETISVRDCLKLM 
LKKSGLQDSFIGDAWHLRKMDWCYEAGEPL^^ 

QLQGPSGHWESHQDQTNCTSSWGRVWRATSSQGASGNEPAQVSLLYliGDIEISEDATLAELKSQAMTL 
PPFLEFGVPSPAHLRAWTVERKRPGRLLRTDRQPLREYKLGRRIEICLEPLQKGEI^GPQDVLLRTQV 
RIPGERTYAPALDLVWNAAQGGTAGSLRQRVADFYCLPVEKIEIAKYFPEKFEWLPISSWNQQITKRK 
KKKKQDYLQGAPYYLKDGDTIGVKVSCLTANLPL 






Further analysis of the NOV6a protein yielded the following properties shown in
Table 6B.

Table 6B. Protein Sequence Properties NOV6a

PSort analysis: | 0.7000 probability located in plasma membrane; 0.3500 probability located in nucleus; 0.3000 probability located in microbody (peroxisome); 0.2000 probability located in endoplasmic reticulum (membrane)
SignalP analysis: | No Known Signal Sequence Predicted



A search of the NOV6a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 6C.



Table 6C. Geneseq Results for NOV6a 


Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV6a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
AAE14346 | Human protease PRTS-11 protein - Homo sapiens, 1108 aa. [WO200183775-A2, 08-NOV-2001] | 1..1044 / 1..1040 | 1037/1044 (99%) / 1037/1044 (99%) | 0.0
AAU68535 | Human novel cytokine encoded by cDNA 790CIP2C_6#1 - Homo sapiens, 1346 aa. [WO200175093-A1, 11-OCT-2001] | 1..1044 / 129..1167 | 1037/1044 (99%) / 1038/1044 (99%) | 0.0
AAB93169 | Human protein sequence SEQ ID NO: 12102 - Homo sapiens, 1014 aa. [EP1074617-A2, 07-FEB-2001] | 1..1019 / 1..1014 | 1013/1019 (99%) / 1013/1019 (99%) | 0.0
AAU68534 | Human novel cytokine encoded by cDNA 790CIP2C_5#1 - Homo sapiens, 1324 aa. [WO200175093-A1, 11-OCT-2001] | 1..1044 / 129..1145 | 1015/1044 (97%) / 1015/1044 (97%) | 0.0
ABG27066 | Novel human diagnostic protein #27057 - Homo sapiens, 674 aa. [WO200175067-A2, 11-OCT-2001] | 500..666 / 47..214 | 166/168 (98%) |









In a BLAST search of public sequence databases, the NOV6a protein was found to
have homology to the proteins shown in the BLASTP data in Table 6D.



Table 6D. Public BLASTP Results for NOV6a 


Protein Accession Number | Protein/Organism/Length | NOV6a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
Q9NVE5 | CDNA FLJ10785 fis, clone NT2RP4000457, weakly similar to ubiquitin carboxyl-terminal hydrolase 15 (EC 3.1.2.15) - Homo sapiens (Human), 1014 aa (fragment). | 1..1019 / 1..1014 | 1013/1019 (99%) / 1013/1019 (99%) | 0.0
Q95KB6 | Hypothetical 102.2 kDa protein - Macaca fascicularis (Crab eating macaque) (Cynomolgus monkey), 907 aa (fragment). | 143..1024 / 30..907 | 844/882 (95%) / 860/882 (96%) | 0.0
Q8S1J6 | Putative ubiquitin carboxyl-terminal hydrolase - Oryza sativa (japonica cultivar-group), 1079 aa. | 3..342 / 223..568 | 102/359 (28%) / 165/359 (45%) | 3e-23
Q8VZM4 | Putative ubiquitin carboxyl-terminal hydrolase - Arabidopsis thaliana (Mouse-ear cress), 683 aa. | 3..202 / 278..480 | 72/205 (35%) / 105/205 (51%) | 3e-23
Q94ED6 | Putative ubiquitin carboxyl-terminal hydrolase - Oryza sativa (Rice), 1108 aa. | 3..342 / 273..618 | 102/359 (28%) / 165/359 (45%) | 3e-23



PFam analysis predicts that the NOV6a protein contains the domains shown in
Table 6E.

Table 6E. Domain Analysis of NOV6a

Pfam Domain | NOV6a Match Region | Identities/Similarities for the Matched Region | Expect Value
UCH-2 | 157..354 | 23/203 (11%) / 141/203 (69%) | 0.00033



Example 7.

The NOV7 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 7A. 



Table 7A. NOV7 Sequence Analysis 




SEQ ID NO: 35 | 692 bp


NOV7a, 
CG142564-01 
DNA Sequence 


GACAGGAGTGAACCCGAGCTGTGCCGACCAACCCCCAGGATGGCGGAAGCTCACCAGGCCGTGGCCTT 


CCAGTTCACGGTGACCCCAGACGGGGTCGACTTCCGGCTCAGTCGGGAGGCCCTGAAACACGTCTACC 

TGTC TGGGATCAACTCCTGGAAGAAACGCCTGATCCGCATCAAGAATGGCATCC TCAGGGGCGTGTAC 

CCTGGCAGCCCCACCAGCTGGCTGGTCGTCATCATGGTAACAGTGGGTTCCTCCTTCTGCAACGTGGA 

CATCTCCTTGGGGCTGGTCAGTTGCATCCAGAGATGCCTCCCTCAGGGGTGTGGCCCCTACCAGACCC 

CGCAGACCCGGGCACTTCTCAGCATGGCCATCTTCTCCACGGGCGTCTGGGTGACGGGCATCTTCTTC 

TTCCGCCAAACCCTGAAGCTGCTTCTCTGCTACCAATCCCAGATC CGCATGTTC GACCCAGAGCAGCA 

CCCCAATCACCTGGGCGCTGGAGGTGGCTTTGGCCCTGTAGCAGATGATGGCTAT^ 

TGATTGCAGGCGAGAACACGATCTTCTTCCACATCTCCAGCAAGTTCTCAAGCTCAGAGACGAACGCC 

CAGCGCTTTGGAAACCACATCCGCAAAGCCCTGCTGGACATTGCTGATCTTTTCCAAGTTCCTCAGGC 

CTACAGCTGAAG 




ORF Start: ATG at 40 | ORF Stop: TGA at 688
SEQ ID NO: 36 | 216 aa | MW at 23874.3 Da


NOV7a, 
CG142564-01 
Protein 
Sequence 


MAEAHQAVAFQFTVTPDGVDFRLSREALKHV^ 

TVGSSFCNVDISLGLVSCIQRCLPQGCGPYQTPQTRALLSMAIFSTGVWVTGIFFFRQTLKLLLCYQS 

QIRMFDPEQHPNHLGAGGGFGPVADDGYGVSYMIAGENTIFFHISSKFSSSETNAQRFGNHIR 

IADLFQVPQAYS 



Further analysis of the NOV7a protein yielded the following properties shown in
Table 7B.

Table 7B. Protein Sequence Properties NOV7a

PSort analysis: | 0.7900 probability located in plasma membrane; tf.G^OtiTprolb^^ " microbody (peroxisome); 0.3000 probability located in Golgi body; 0.2000 probability located in endoplasmic reticulum (membrane)
SignalP analysis: | Cleavage site between residues 5 and 6



A search of the NOV7a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 7C.



Table 7C. Geneseq Results for NOV7a 


Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV7a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
AAW14438 | Type I carnitine palmitoyl transferase-like protein - Homo sapiens, 772 aa. [JP09009969-A, 14-JAN-1997] | 1..134 / 1..134 | 131/134 (97%) / 131/134 (97%) | 4e-72
AAE10322 | Human carnitine acyltransferase, 26886 - Homo sapiens, 803 aa. [WO200166759-A2, 13-SEP-2001] | 1..134 / 1..132 | 57/134 (42%) / 78/134 (57%) | 1e-21
AAY79220 | Human transferase TRNSFS-12 - Homo sapiens, 803 aa. [WO200014251-A2, 16-MAR-2000] | 1..134 / 1..132 | 57/134 (42%) / 78/134 (57%) | 1e-21
ABB67527 | Drosophila melanogaster polypeptide SEQ ID NO 29373 - Drosophila melanogaster, 780 aa. [WO200171042-A2, 27-SEP-2001] | 137..210 / 688..761 | 43/74 (58%) / 55/74 (74%) | 6e-19
ABB66942 | Drosophila melanogaster polypeptide SEQ ID NO 27618 - Drosophila melanogaster, 782 aa. [WO200171042-A2, 27-SEP-2001] | 137..210 / 690..763 | 43/74 (58%) / 55/74 (74%) | 6e-19

In a BLAST search of public sequence databases, the NOV7a protein was found to
have homology to the proteins shown in the BLASTP data in Table 7D.



Table 7D. Public BLASTP Results for NOV7a 


Protein 

Accession 

Number 


T*i*n*foin /ffcrcroniCFin/T .ATlofTl 
X lU LcLU/ \J 1 g«J 111 Mil/ JUCllg 111 


NOV7a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q9BY90 


KIAA1670 protein - Homo 
sapiens (Human), 598 aa 
(fragment). 


1..134 
18..151 


133/134 (99%) 
133/134(99%) 


2e-73 


Q92523 


Carnitine 

O-palmitoyltransferase I, 
mitochondrial muscle isoform 
(EC 2.3.1.21) (CPTI) 
(CPTI-M) (Carnitine 
palmitoyltransferase I like 
protein) - Homo sapiens 
(Human), 772 aa. 


1..134 
1..134 


133/134(99%) 
133/134(99%) 


2e-73 


Q924X2 


Muscle-type carnitine 
palmitoyltransferase I (EC 
2.3.1.21) (Hypothetical 88.2 
kDa protein) - Mus musculus 
(Mouse), 772 aa. 


1..149 
1..147 


118/149(79%) 
128/149 (85%) 


le-63 


035287 


Carnitine palmitoyltransferase 
I - Mus musculus (Mouse), 
772 aa. 


1..149 
1..147 


118/149(79%) 
128/149 (85%) 


1e-63


Q9QYP4 


Muscle type carnitine 
palmitoyltransferase I - Mus 
musculus (Mouse), 772 aa. 


1..149 
1..147 


118/149(79%) 
128/149 (85%) 


1e-63
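
The Identities/Similarities columns in the tables above pair a residue count with a percentage. A minimal Python sketch of how such an entry can be parsed and recomputed is given here. The function name `parse_match` is hypothetical, and truncating (rather than rounding) the percentage is an assumption that happens to agree with entries such as 131/134 (97%) and 128/149 (85%); it is not taken from the specification.

```python
import re

def parse_match(entry):
    """Parse an entry such as '131/134 (97%)'.

    Returns (matched, aligned, reported_pct, recomputed_pct), where
    recomputed_pct is the truncated integer percentage (an assumption).
    """
    m = re.fullmatch(r"\s*(\d+)\s*/\s*(\d+)\s*\(\s*(\d+)\s*%\s*\)\s*", entry)
    if m is None:
        raise ValueError(f"unrecognized entry: {entry!r}")
    matched, aligned, reported = (int(g) for g in m.groups())
    recomputed = (100 * matched) // aligned  # integer truncation
    return matched, aligned, reported, recomputed

if __name__ == "__main__":
    for entry in ("131/134 (97%)", "118/149(79%)", "128/149 (85%)"):
        print(entry, "->", parse_match(entry))
```

Comparing the reported and recomputed percentages is a quick way to spot transcription errors in a flattened table.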
Example 8.

The NOV8 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 8A.



Table 8A. NOV8 Sequence Analysis




SEQ ID NO: 37


1122 bp


NOV8a, 
CG142797-01 
DNA Sequence 


CTAGAT TTTTGAAACATGAATCCTTCACTCC TC CTGGCTGCC TTTTTCCTGGGAATTGCCTCAGC TGC 

TCTAACACGTGACCACAGTCTAGACGCACAATGGACCAAGTGGAAGGCAAAGCACAAGAGATTATA 

ACATGGAGAACATGAAGATGACTGAGCAGCACAATCAGGAATACAGCCAAGGGAAACAC^ 

ATGGCCATGAACACCTTTGGAGACATGACCACTGAAGAATTCAGGCAGGTGATGAATGGTTTTCAATA 

CC^GAAGCAC^GGAACGGGAAACAGTTCCAGGAACGCCTGCTTCTTGAGATCCCC^CmTCTGTGGACT 


GGAGAGAGATUVGGCTACATGACTCCTGTGAAGGATCA^C^ 

GCAACTGGTGCTCTGGAAGGGCAGATGTTCTGGAAAACAGGCAAACTTATCTCACTGAATGAGCAGAA 

TCTGGTAGACTGCTCTGGGCCTCAAGGCAATGAGGGCTGCAATGGTGGCTTCATGGATAATCCCTTCC 

GGTATGTTCAGGAGAACGGAGGCCTGGACTCTGAGGCATCCTATCCATATGAAAAAACCTGTAGGTAC 

AATCCCAAGTATTCTGCTGCTAATGACACTGGCTTTGTGGACATCCCTTCACAGGAGAAGGACCTOGC 

GAAGGCAGTGGCAACTGTGGGGCCC^TCTCTGTTGCTGCTGGTGCAAGCCATGTCTCCTTCGAGTTC 

ATAAAAAAGGTATTTATTTTGAGCCACGCTGTQACCCCGAAGGTCTGGATCATGCTATGCTGCTGGTT 

GGCTACAGCTATGAAGGAGCAGACTCAGATAACAATAAATATTGGCTGGTGAAGAACAGGTATGGTAA 

AAACTGGGGCATGGATGGCTACATAAAGATGGCCAAAGACCGGAGGAACAACTGTGGAATTGCCACAG 

rAGCCAGrTACCCCACTGTGTGAGCTGATGGATGGTGATGAGGAAGAACTTGACTGAGGATGGCACAT 


CCAAAGGAGGAATTTATCTTCAATCTACCAGCCCCTGCTGTGTGGAATGCGCACTTCAATCATTGAAG 


ATCCAAGTGTGATTGGAATTCTGATATTTTCACA 




ORF Start: ATG at 16


ORF Stop: TGA at 973



NOV8a, 



SEQ ID NO: 38



319 aa



MW at 35984.2kD



MNPSLLI^ 

, _ r ^ JFGDMTTEEFRQVMNGFQYQKHimGKQFQERLLL^ 

CG142797-01 | E gqmfwktgklislneqi^^ 

Protein Jaandtgfvdi psqekdlakavatvgpi svaagashvsfqfykkgi yfeprcdpegldhamllvgys ye 

o |gADSDNNKYWLVKNRYGKNWGMIX3YIKM^ 

Sequence 
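
The ORF coordinates and the protein length reported in Table 8A can be cross-checked with simple arithmetic. The sketch below assumes, as the numbers in the table suggest, that the start coordinate is the first base of the ATG and the stop coordinate is the first base of the stop codon (both 1-based). The helper name `orf_length_aa` is hypothetical.

```python
def orf_length_aa(start, stop):
    """Number of encoded residues for an ORF.

    start: 1-based position of the first base of the start codon.
    stop:  1-based position of the first base of the stop codon.
    """
    span = stop - start
    if span <= 0 or span % 3 != 0:
        raise ValueError("start and stop are not in the same reading frame")
    return span // 3

if __name__ == "__main__":
    # NOV8a: ORF Start ATG at 16, ORF Stop TGA at 973, protein listed as 319 aa
    print(orf_length_aa(16, 973))
```

A mismatch between this value and the tabulated length would point to a transcription error in one of the coordinates.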
Further analysis of the NOV8a protein yielded the following properties shown in
Table 8B. 



Table 8B. Protein Sequence Properties NOV8a 


PSort analysis: 


0.8200 probability located in endoplasmic reticulum (membrane); 0.5140 
probability located in plasma membrane; 0.2423 probability located in 
microbody (peroxisome); 0.1000 probability located in endoplasmic reticulum 
(lumen) 


SignalP analysis: 


Cleavage site between residues 18 and 19 



A search of the NOV8a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 8C.



Table 8C. Geneseq Results for NOV8a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV8a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 
AAU98883


Human protease PRTS1 - 
Homo sapiens, 334 aa. 
[WO200238744-A2, 
16-MAY-2002] 



1..319 

1..334 


310/334 (92%) 


e-180 


ABG61771 


Novel cathepsin-L 
precursor-like protein - Homo 
sapiens, 333 aa. 
[WO200229058-A2, 
11-APR-2002]


1..319 
1..333 


288/333 (86%) 
300/333 (89%) 


e-171 


ABG66692 


Human novel polypeptide #27 
- Homo sapiens, 333 aa. 
[WO200244340-A2, 
06-JUN-2002]


1..319 
1..333 


260/333 (78%) 
278/333 (83%) 


e-154 


ABG66714 


Human novel polypeptide #49 
- Homo sapiens, 333 aa. 
[WO200244340-A2, 
06-JUN-2002] 


1..319 
1..333 


259/333 (77%) 
277/333 (82%) 


e-154 


ABB77396


Human cathepsin L - Homo 
sapiens, 333 aa. 
[DE10050274-A1, 
18-APR-2002] 


1..319 
1..333 


249/333 (74%) 
274/333 (81%) 


e-147 
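
Expect values in these tables are sometimes written without a leading mantissa (for example e-147, read as 1e-147). A small sketch of normalizing such strings to floating-point numbers follows. The name `parse_expect` is hypothetical, and treating a bare exponent as having mantissa 1 is an assumption based on the usual BLAST report convention.

```python
def parse_expect(text):
    """Convert an Expect value string such as 'e-147', '2e-73' or '0.0' to float."""
    s = text.strip().lower()
    if s.startswith("e"):
        s = "1" + s  # bare exponent: assume mantissa of 1
    return float(s)

if __name__ == "__main__":
    for raw in ("e-180", "e-147", "2e-73", "0.0"):
        print(raw, "->", parse_expect(raw))
```

Once converted, hits can be sorted numerically, with smaller values indicating more significant matches.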


In a BLAST search of public sequence databases, the NOV8a protein was found to
have homology to the proteins shown in the BLASTP data in Table 8D. 


Table 8D. Public BLASTP Results for NOV8a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV8a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


P07711 


Cathepsin L precursor (EC 
3.4.22.15) (Major excreted 
protein) (MEP) - Homo 
sapiens (Human), 333 aa. 


1..319 
1..333 


249/333 (74%) 
274/333 (81%) 


e-147 


Q9GKL8 


Cysteine protease - 
Cercopithecus aethiops (Green 
monkey) (Grivet), 333 aa. 


1..319
1..333 


247/333 (74%) 
273/333 (81%) 


e-146 


Q9GL24 


Cathepsin L (EC 3.4.22.15) - 
Canis familiaris (Dog), 333 aa. 


1..319 
1..333 


236/334(70%) 
265/334(78%) 


e-138 


Q28944 


Cathepsin L precursor (EC 
3.4.22.15) - Sus scrofa (Pig), 
334 aa. 


1..319 
1..334 


228/334 (68%) 
263/334 (78%) 


e-135 
P25975


Cathepsin L precursor (EC 
3.4.22.15) - Bos taurus
(Bovine), 334 aa. 



1..319 
1..334 



222/334 (66%) 
261/334 (77%) 


e-133 



PFam analysis predicts that the NOV8a protein contains the domains shown in the 
Table 8E. 
Table 8E. Domain Analysis of NOV8a


Pfam Domain


NOV8a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


Peptidase_C1


103..318


123/337 (36%) 
194/337 (58%) 


2.4e-111



Example 9. 

The NOV9 clone was analyzed, and the nucleotide and encoded polypeptide

sequences are shown in Table 9A. 



Table 9A. NOV9 Sequence Analysis 




SEQ ID NO: 39    1740 bp


NOV9a, 
CG143216-01 
DNA Sequence 


CACGAGGCCGCTAACGGTCCGGCGCCCCTCGGCGTCCGCGCGCCCCCAGCCTGGCGGACGAGCCCGGC 


GGCGGAGATGGGGGCGACGGGGGCGGCGGAGCCGCTGCAATCCGTGCTGTGGGTGAAGCAGCAGCGCT 

GCGCCGTGAGCCTGGAGCCCGCGCGGGCTCTGCTGCGCTGGTGGCGGAGCCCGGGGCCCGGAGCCGGC 

GCCCCCGGTGCTGATGCCTGCTCTGTGCCTGTATCTGAGATCATCGCCGTTGAGGAAACAGACGTTCA 

CGGGAAAC ATC AAGGCAGTGGAAAATGGCAGAAAATGGAAAAGCCT TACGC TTTTACAGTTC ACTGTG 

TAAAGAGAGCACGACGGCACCGCTGGAAGTGGGCGCAGG0X5ACTTTCTGGTGTCCAGAGGAGCAGCTG 

TGTCACTTGTGGCTGCAGACCCTGCGGGAGATGCTGGAGAAGCTGACGTCCAGACCAAAGCATTTACT 

GGTATTTATCAACCCGTTTGGAGGAAAAGGACAAGGCAAGCGGATATATGAAAGAAAAGTGGCACCAC 

TGTTCACCTTAGCCTCCATCACCACTGACATCATCGTTACTGAACATGCTAATCAGGCCAAGGAGACT 

CTGTATGAGATTAACATAGACAAATACGACGGCATCGTCTGTGTCGGCGGAGATGGTATGTTCAGCGA 

GGTGC TGCACGGTC TGATTGGGAGGACGCAGAGGAGCGC CGGGGTCGACCAGAACCACCCCCGGGCTG 

TGCTGGTCCCCAGTAGCCTCCGGATTGGAATCATTCCCGCAGGGTCAACGGACTGCGTGTGTTACTCC 

ACCGTGGGCACCAGCGACGCAGAAAdCTCGGCGCTGCATATCGTTGTTGGGGACTCGCTGGCCATGGA 

TGTGTCCTCAGTCCACCACAACAGCACACTCCTTCGCTACTCCGTGTCCCTGCTGGGCTACGGCTTCT 

ACGGGGACATCATCAAGGACAGTGAGAAGAAACGGTGGTTGGGTCTTGCCAGATACGACTTTTCAGGT 

TTAAAGACCTTCCTCTCCCACCACTGCTATGAAGGGACAGTGTCCTTCCTCCCTGCACAACACACGGT 

GGGATCTCCAAGGGATAGGAAGCCCTGCCGGGCAGGATGCTTTGTTTGCAGGCAAAGCAAGCAGCAGC 

TGGAGGAGGAGCAGAAGAAAGCACTGTATGGTTTGGAAGCTGCGGAGGACGTGGAGGAGTGGCAAGTC 

GTCTGTGGGAAGTTTCTGGCCATCAATGCCACAAACATGTCCTGTGCTTGTCGCCGGAGCCCCAGGGG 

CCTCTCCCCGGCTGCCCACTTGGGAGACGGGTCTTCTGACCTCATCCTCATCCGGAAATGCTCCAGGT 

TCAATTTTCTGAGATTTCTCATCAGGCACACCAACCAGCAGGACCAGTTTGACTTCACTTTTGTTGAA 

GTTTATCGCGTCAAGAAATTCCAGTTTACGTCGAAGCACATGGAGGATGAGGACAGCGACCTCAAGGA 

GGGGGGGAAGAAGCGCTTTGGGCACATTTGCAGCAGCCACCCCTCCTGCTGCTGCACCGTCTCCAACA 

GCTCCTGGAACTGCGACGGGGAGGTCCTGCACAGCCCTGCCATCGAGGTCAGAGTCCACTGCCAGCTG 

GTTCGACTCTTTGGACGAGGAATTGAAGAGAATCCGAAGCG&GACT 

CCTGCTCTC GAACTGGGAAAGTGTGAAAACTATTTAAGAT 




ORF Start: ATG at 76


ORF Stop: TGA at 1687 
SEQ ID NO: 40    537 aa    MW at 59976.9kD


NOV9a, 
CG143216-01 
Protein 
Sequence 


MGATGAABPLQSVLWVKQQRCAVSLEPARALLRWWRSPGPGAGAPGJ^ACSTOVSEIIAVEETDVHGK 
HQGSGKWQKME^YAFTVHCTKRARRHRWKWAQVTFWCPEEQLCm 

INPFGGKGQGKRI YERKVAPLFTLAS ITTDI IVTEHANQAKETLYEINIDKYDGIVCVGGDGMFSEVli 
HGLIGRTQRSAGVDQNHPRAVLVPSSLRIGIIPAGSTDCVCYSTVGTSDAETSALHIWGDSLAMDVS 
S VHHNSTLLRYS VS LLGYGFYGDI IKDSEKKRWLGLARYDF SGLKTFL SHHC YEGTVSFLPAQHTVG S 
PRDRKPCRAGCFVCRQSKQQLEEEQKKALYGLEAAEDVEEWQWCGKFLAINATN^CACRRSPRGLS 
PAAHLGDGSSDLILIRKCSRFOTLRFLIRHTNQQDQFDFT 

KKRFGHICSSHPSCCCTVSNSSWNCTCEVLHSPAIEVRVHCQIiVRLFARGIEENPKPDSHS 



Further analysis of the NOV9a protein yielded the following properties shown in 
Table 9B. 



Table 9B. Protein Sequence Properties NOV9a 


PSort analysis: 


0.5121 probability located in microbody (peroxisome); 0.3000 probability
located in nucleus; 0.1000 probability located in mitochondrial matrix space;
0.1000 probability located in lysosome (lumen)


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV9a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 9C.



Table 9C. Geneseq Results for NOV9a


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date]


NOV9a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 


ABB07857 


Human sphingosine 
kinase-like protein - Homo 
sapiens, 562 aa. 
[WO200228906-A2, 
11-APR-2002]


1..537
26..562 


537/537 (100%) 
537/537 (100%) 


0.0 


ABB07856 


Human sphingosine 
kinase-like protein - Homo 
sapiens, 537 aa. 
[WO200228906-A2, 
11-APR-2002]


1..537 
1..537


537/537 (100%) 
537/537 (100%) 


0.0 
AAM49115


Human ceramide kinase 
hCERK1 - Homo sapiens,
537 aa. [WO200196575-A1, 
20-DEC-2001] 



1..537 
1..537 


535/537 (99%) 
536/537 (99%) 



0.0 


AAY96059 


Human sphingosine kinase C 
- Homo sapiens, 460 aa. 
[WO200052173-A2, 
08-SEP-2000]


78..537 
1..460 


458/460 (99%) 
459/460 (99%) 


0.0 


AAE07884 


Human sphingosine kinase 
(SphK) protein #2 - Homo 
sapiens, 471 aa. 
[WO200160990-A2, 
23-AUG-2001] 


78..537 
1..471 


459/471 (97%) 
460/471 (97%) 


0.0 


In a BLAST search of public sequence databases, the NOV9a protein was found to
have homology to the proteins shown in the BLASTP data in Table 9D. 


Table 9D. Public BLASTP Results for NOV9a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV9a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


Q8TCT0 


Putative lipid kinase - Homo 
sapiens (Human), 537 aa. 


1..537 
1..537 


537/537 (100%) 
537/537 (100%) 


0.0 


Q9BYB3 


KIAA1646 protein - Homo 
sapiens (Human), 481 aa 
(fragment). 


57..537 
1..481 


481/481(100%) 
481/481 (100%) 


0.0 


BAC01155 


Ceramide kinases - Mus
musculus (Mouse), 531 aa. 


1..529 
1..529 


450/529 (85%) 
483/529 (91%) 


0.0 


Q9UGE5 


DA59H18.2 (Novel protein
similar to human, mouse, 
yeast, worm and plant 
(Predicted) proteins) - Homo 
sapiens (Human), 326 aa 
(fragment). 


130..444 
1..326 


314/326(96%) 
315/326 (96%) 


0.0 


Q9TZI1 


T10B11.2 protein - 
Caenorhabditis elegans, 549 
aa. 


79..525 
115..526


141/458 (30%) 
230/458 (49%) 


1e-52



PFam analysis predicts that the NOV9a protein contains the domains shown in the 
Table 9E.
Table 9E. Domain Analysis of NOV9a


Pfam Domain 


NOV9a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


PH 


32..124 


9/93 (10%) 
64/93 (69%) 


0.38 


DAGKc 


132..278 


32/165 (19%) 
100/165 (61%) 


0.00015 



Example 10. 

The NOV10 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 10A. 



Table 10A. NOV10 Sequence Analysis 




SEQ ID NO: 41    772 bp


NOV10a,
CG143787-01 
DNA Sequence 


AACTGGAGACCACAACTTCAT6CTGCGTGGGATCTCCCAACTACCTGCAGTGGCCACCATGTCTTGG 
GTCCTGCTGCCTGTACTTTGGCTCATTGTTCAAACTCAAGCAATAGCCA. 

TAACGCTCCATGAAATAGTTTGTCCTAAAAAACTTCACATTTTACACAAAAGAGAGATCAAG7\AC^ 
C CAGACAGAAAAGC ATGGCAAAGAGGAAAGGTATGAAC C TGAAGT TC AAT ATCAG ATGATCT TAAAT 
GGAGAAGAAATCATTC TCTCCCTAC AAAAAACCAAGC ACCTC CTGGGGCCAGACTACACTGAAACAT 
TGTACTCACCCAGAGGAGAGGAAATTACCACGAAACCTGAGAACATGGAACACTGTTACTATAAAGG 
AAACATCCTAAATGAAAAGAATTCTGTTGCCAGCATCAGTACTTGTGACGGGTTGAGAGGATACTTC 
ACACATCATCACCAAAGATACCTTTTATCTCAGAAACCAAAGTGCC 

CAAATATAATGACAACACCAGTGTGTGGGAACCACCTTCTAGAAGTGGGAGAAGACTGTGATTGTGG 
CTCTCTTAAGGAGTGTACCAATCTCTGCTGTG7\AGCCCTAACGTGTA^CTGAAGCCTGG2\ACTGAT 
TGCGGAC5GAGATGCTCCAAACCATACCACAGAGTGAATCCAA7^AGTCTGCTTCACTGAGATGCTACC 


TTGCCAGGACAAGAACCAAGAACTCTAACTGTCCC 




ORF Start: ATG at 20    ORF Stop: TGA at 704





SEQ ID NO: 42


228 aa    MW at 25718.4kD


NOV10a,
CG143787-01 
Protein Sequence 


MLRGISQLPAVATMSWVLLPVLWLIVQTQAIAIKQTPELTLHEr^CPKKLHILHKREIKKNQTEKHG 
KEERYEPEVQYQMILNGEEIILSLQKTKHIjI^PDYTETLYSPRGEE 

nsvasistcixslrgyfthhhqryllsqkpkcllqapiptnimttpvcgnhll'evgedcdcgslkect 

NLCCEALTCKLKPGTDCGGDAPNHTTE 





SEQIDNO:43 


706 bp 


NOV10b,
278889162 DNA 
Sequence 


CACCGGATCCACCATGCTGCGTGGGATCTCCCAACTACCTGCAGTGGCCACCATGTCTTGGGTCCTG 
C TG C C TGTAC TTTG GC TCATTGTT CAAAC TCAAGCAAT AG C C ATAAAGC AAACAC C TGAATT AACGC 
TCCATGAAATAGTTTGTCCTAAAAAACTTCACATTTTAC^ 

AGAAAAGCATGGCAAAGAGGAAAGGTATGAACCTGAAGTTCAATATCAGATGATCTTAAATGGAGAA 
GAAATCATTCTCTCCCTACAAAAAACCAAG

CACCCAGAGGAGAGGAAATTACCACGAAACCTGAGAACATGGAACACTGTTACTATAAAGGAAACAT 
CCTAAATGAAAAGAATTCTGTTGCCAGt^TCAGTACTTGTGACGGGTTGAGAGGATACTTCACACAT 
C ATCACGAAAGATACCTTTTATC TCAGTUyvCCAAAGTGCCTGCTGC AAGCACCTATTCC TACAAATA 
TAATGACAACACCAGTGTGTGGGAACCACCTTCTAGAAGTGGGAGAAGACTGTGATTGTGGCTCTCT 
TAAGGAGTGTACCAATCTCTGCTGTGAAGCCCTAACGTGTAAACTGAAGCCTGGAACTGATTGCGGA 
GGAGATGC TCCAAACC ATACC AC AGAGCTCGAGGGC 




ORF Start: at 2    ORF Stop: end of sequence



SEQ ID NO: 44    235 aa    MW at 263^1kD



NOVlOb Itgstmlrgi sqlpavatmswvllpvlwlivqtqaiaikqtpeltlhei vc pkklhilhkreiknnqt 

0Q<?Q1 * JEKHGKE3SRYEPEVQYQMILNGEEI ILSLQKTKHLLGPDYTETLYSPRGEEITTKPENMEHCYYKGNI 

Z/SOWlOi JLNEKNSVASISTCDGLRGYFTHHHQRYLLSQKPKCLLQAPIPTO 

Protein Sequence IkectnI/Ccealtcklkpgtdcggdapnhtteleg 





SEQ ID NO: 45 


118 bp


NOV10c,
278689868 DNA 
Sequence 


CACCGGATCCGAAGTGGGAGAAGACTGTGATTGTGGCTCTCTTAAGGAGTGTACCAATCTCTGCTGT 
GAAGCCCTAACGTGTAAACTGAAGCCTGGAACTGATTGCGGACTCGAGGGC 




ORF Start: at 2


ORF Stop: end of sequence
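
The ORF annotation for the NOV10c DNA sequence (start at position 2, stop at the end of the sequence) can be illustrated by translating the 118 bp sequence in that reading frame. The sketch below is only an illustration under stated assumptions: it uses the standard genetic code, and the names `CODON_TABLE` and `translate` are hypothetical helpers, not part of the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(dna, start=1):
    """Translate dna from the 1-based position `start`, stopping at the
    first stop codon or at the last complete codon."""
    seq = dna.upper()
    protein = []
    for i in range(start - 1, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# NOV10c DNA sequence (SEQ ID NO: 45), 118 bp, as listed in the table.
NOV10C_DNA = (
    "CACCGGATCCGAAGTGGGAGAAGACTGTGATTGTGGCTCTCTTAAGGAGTGTACCAATCTCTGCTGT"
    "GAAGCCCTAACGTGTAAACTGAAGCCTGGAACTGATTGCGGACTCGAGGGC"
)

if __name__ == "__main__":
    print(translate(NOV10C_DNA, start=2))
```

Translating from position 2 yields a 39-residue peptide, which agrees with the length given for SEQ ID NO: 46.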





SEQ ID NO: 46 


39 aa    MW at 3983.4kD


NOV10c,

278689868 Protein Sequence 


TGSEVGEDCDCGSLKECTNLCCEALTCKLKPGTDCGLEG 
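
The molecular weight listed for the NOV10c peptide can be approximated by summing average residue masses and adding one water molecule. This is a sketch under stated assumptions: the residue masses are commonly used average values, the function name `molecular_weight` is hypothetical, and the specification does not state which mass table or software was used. The result is in daltons, which is the scale the tabulated figure appears to be on for a 39-residue peptide.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water for the free N- and C-termini

def molecular_weight(protein):
    """Approximate average molecular weight of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in protein.upper()) + WATER

NOV10C_PROTEIN = "TGSEVGEDCDCGSLKECTNLCCEALTCKLKPGTDCGLEG"

if __name__ == "__main__":
    print(round(molecular_weight(NOV10C_PROTEIN), 1))
```

The computed value falls within a fraction of a dalton of the 3983.4 listed in the table; small differences are expected from rounding and from the choice of mass table.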



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 10B. 



Table 10B. Comparison of NOV10a against NOV10b and NOV10c.


Protein Sequence 


NOV10a Residues/
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV10b


1..228 
5..232 


228/228 (100%) 
228/228 (100%) 


NOV10c


187..219 
4..36 


33/33 (100%) 
33/33 (100%) 
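
The residue ranges in Table 10B (for example 187..219 against 4..36) read as inclusive, 1-based coordinates. The following sketch shows how such a range can be parsed and used to extract the matched segment from the NOV10c peptide; `parse_range` and `slice_residues` are hypothetical helper names.

```python
def parse_range(text):
    """Parse an inclusive 1-based range such as '187..219' into (start, end)."""
    start, end = (int(p) for p in text.replace(" ", "").split(".."))
    return start, end

def slice_residues(seq, text):
    """Return the residues of seq covered by the inclusive 1-based range."""
    start, end = parse_range(text)
    return seq[start - 1:end]

NOV10C_PROTEIN = "TGSEVGEDCDCGSLKECTNLCCEALTCKLKPGTDCGLEG"

if __name__ == "__main__":
    segment = slice_residues(NOV10C_PROTEIN, "4..36")
    print(len(segment), segment)
```

Both ranges in the NOV10c row span 33 residues, matching the 33/33 denominator in the Identities column.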
Further analysis of the NOV10a protein yielded the following properties shown in
Table 10C.






Table 10C. Protein Sequence Properties NOV10a


PSort analysis: 


0.8200 probability located in outside; 0.1900 probability located in lysosome 
(lumen); 0.1000 probability located in endoplasmic reticulum (membrane); 
0.1000 probability located in endoplasmic reticulum (lumen)


SignalP analysis: 


Cleavage site between residues 33 and 34 
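
A predicted cleavage site between residues 33 and 34 implies that the first 33 residues form the signal peptide and the mature protein begins at residue 34. A minimal sketch follows, using only the legible N-terminal portion of the NOV10a sequence from Table 10A; `split_at_cleavage` is a hypothetical helper name.

```python
def split_at_cleavage(seq, last_signal_residue):
    """Split seq after the given 1-based residue number.

    Returns (signal_peptide, mature_portion).
    """
    return seq[:last_signal_residue], seq[last_signal_residue:]

# N-terminal 43 residues of NOV10a (SEQ ID NO: 42) as listed in Table 10A.
NOV10A_NTERM = "MLRGISQLPAVATMSWVLLPVLWLIVQTQAIAIKQTPELTLHE"

if __name__ == "__main__":
    signal, mature = split_at_cleavage(NOV10A_NTERM, 33)
    print("signal peptide:", signal)
    print("mature N-terminus:", mature)
```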



A search of the NOV10a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 10D.



Table 10D. Geneseq Results for NOV10a


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date]


NOV10a
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 


AAW75769 


Human metalloproteinase 
BS10.55 - Homo sapiens, 
470 aa. [WO9839421-A2,
11-SEP-1998]


1..157 
1..157 


157/157 (100%) 
157/157 (100%) 


7e-90 


AAW28509 


Product of clone J5 - Homo 
sapiens, 470 aa. 
[WO9707198-A2, 
27-FEB-1997] 


1..157 
1..157 


157/157 (100%) 
157/157 (100%) 


7e-90 


AAB53240 


Human colon cancer antigen 
protein sequence SEQ ID 
NO:780 - Homo sapiens, 110 
aa. [WO200055351-A1, 
21-SEP-2000] 


153..228 
35..110 


73/76 (96%) 
74/76(97%) 


7e-41 


ABB11929


Human eMDC II protein 
homologue, SEQ ID 
NO:2299 - Homo sapiens, 
788 aa. [WO200157188-A2, 
09-AUG-2001] 


18..159 
18..153 


71/142(50%) 
99/142(69%) 


2e-32 


AAW90865 


Human ADAM protein #4 - 
Homo sapiens, 775 aa. 
[WO200014227-A1, 
16-MAR-2000] 


18..159 
5..140 


71/142(50%) 
99/142(69%) 


2e-32 
In a BLAST search of public sequence databases, the NOV10a protein was found to
have homology to the proteins shown in the BLASTP data in Table 10E.


Table 10E. Public BLASTP Results for NOV10a


Protein

Accession 
Number 


Protein/Organism/Length


NOV10a
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


O15204


Disintegrin-protease - Homo 
sapiens (Human), 470 aa. 


1..157 
1..157 


157/157 (100%)
157/157 (100%) 


2e-89 


Q9R0X2 


Disintegrin metalloprotease 
precursor - Mus musculus 
(Mouse), 467 aa. 


1..157 
1..157 


104/157 (66%) 
124/157 (78%) 


8e-56 


Q9XSL6 


ADAM 28 precursor (EC 
3.4.24.-) (A disintegrin and 
metalloproteinase domain 28) 
(eMDC II) - Macaca
fascicularis (Crab eating 
macaque) (Cynomolgus 
monkey), 776 aa. 


14..159 

1..141 


70/146 (47%) 
101/146 (68%) 


1e-32


E1262181 


SEQUENCE 3 FROM 
PATENT WO9709430- 
unidentified, 530 aa. 


18..159 
5..140 


71/142 (50%) 
99/142 (69%) 


5e-32 


Q9UKQ2 


ADAM 28 precursor (EC 
3.4.24.-) (A disintegrin and 
metalloproteinase domain 28) 
(Metalloproteinase-like, 
disintegrin-like, and cysteine- 
rich protein-L) (MDC-L) 
(eMDC II) (ADAM23) -
Homo sapiens (Human), 775 
aa. 


18..159
5..140


71/142 (50%) 
99/142 (69%) 


5e-32 



PFam analysis predicts that the NOV10a protein contains the domains shown in the
Table 10F.



Table 10F. Domain Analysis of NOV10a


Pfam Domain


NOV10a Match
Region 


Identities/ 
Similarities 
for the Matched
Region 


Expect Value 


Pep_M12B_propep 


90..201 


32/119 (27%) 
79/119(66%) 


1.8e-20 


disintegrin 


187..219 


20/33 (61%) 
26/33 (79%) 


4e-14 



Example 11. 

The NOV11 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 11A.



Table 11A. NOV11 Sequence Analysis 




SEQ ID NO: 47    484 bp


NOV11a,
CG144112-01
DNA Sequence 


ACTGGGTCCGAATC AGTAGGTGACCCCGC CCC TGGATTCTGGAAGACCTC ACC ATGGGACGCCCCCG 


ACCTCGTGCGGCC AAGACGTGGATGTTCC TGCTCTTGC TGGGGGGAGCCTGGGC AGG AAATACAC AG 
TACGCCTGGGAGACCACAGCCTACAGAATAAAGATGGCCCAGAAGTGCAGTCCCCGAGAGAATTTTC 
CTGACACTCTCAACTGTGCAGAAGTAAAAATCTTTCCCCAGAAGAAGTGTGAGGATGCTTACCCGGG 
GCAGATCACAGATGGCATGGTCTGTGCAGGCAGCAGCAAAGGGGCTGACACGTGCCAGGGCGATTCT 
GGAGGCCCCCTGGTGTGTGATGGTGCACTC(^GGGCATCACATCCTGGGGCTCAGACCCCTGTGGGA 
GGTCCGACAAACCTGGCGTCTATACCAACATCTGCCGCTACCTGGACTGGATCAAGAAGATCATAGG 
CAGCAAGGGCTGATT 




ORF Start: ATG at 54 


ORF Stop: TGA at 480
SEQ ID NO: 48



142 aa 



MW at 15404.5kD 



NOV1 la JMGRPRPRAAKTWMFLLLUSGAWAGM^^^ 

JDAYPGQITIXSMVCAGSSKGADTCQGDSGGPIiVCDGALQGITSWGSDPCGRSDKPGVYTNICRYLDWI 

CG1441 12-01 Jkkiigskg 
Protein Sequence 
SEQ ID NO: 49    288 bp


NOV11b,
CG144112-04
DNA Sequence 


CCCCGCGCCTGGATTCTGGAAGACCTCACCATGGGACGCCCCCGACCTCGTGCGGCCAAGACGTGGA 


TGTTC CTGCTCTTGC TGGGGGGAGCCTGGGCAGGGCAGGGCGATTC TGGAGGCCCCCTGGTGTGTGA 
TGGTGCACTCCAGGGCATCACATCCTGGGGCTCAGACCCCTGTGGGAGGteCGACAAACCTGGCGTC 
TATACCAACATCTGCCGCTACCTGGACTGGATCAAGAAGATCATAGGCAGCAAGGGCTGATTCTAGG 
ATAAGCACTAGATCTCCCTT 




ORF Start: ATG at 31    ORF Stop: TGA at 259
SEQ ID NO: 50    76 aa    MW at 8110.3kD


NOV11b,
CG144112-04
Protein Sequence 


MGRPRPRAAKTWMFLLLIX3GAWAGQGDSGGPLVCDGALQGITSWGSDPCGRSDKPGVYTNICRYLDW 
IKKIIGSKG 






SEQ ID NO: 51    445 bp


NOV11c,
255501898 DNA 
Sequence 


CACCAAGCTTATGGGACGCCCCC^CCTCGTGCGGCCAAGACGTGGATGTTCCTGCTCTTGCTGGGG 
GGAGCCTGGGC^GGAAATACACAGTACGCCTGGGAGACCACAGCCTACAGAATAAAGATGGCCCAGA 
AGTGCAGTCCCCGAGAGAATTTTCCTGACACTCTCAACTGTGCAGAAGTAAAAATCTTTCCCCAGAA 
GAAGTGTGAGGATGCTTACCCGGGGCAGATCACAGATGGCATGGTCTGTGCAGGCAGCAGCAAAGGG 
GCTGACACGTGCCAGGGCGATTCTGGAGGCCCCCTGGTGTGTGATGGTGCACTCCAGGGCATCACAT 
CCTGGGGCTCAGACCCCTGTGGGAGGTCCGACAAACCTGGCGTCTATACCAACATCTGCCGCTACCT 
GGACTGGATCAAGAAGATCATAGGCAGCAAGGGCCTCGAGGGC 




ORF Start: at 2    ORF Stop: end of sequence





SEQ ID NO: 52    148 aa    MW at 16046.2kD


NOV11c,
255501898 
Protein Sequence 


TKLMGRPRPRAAKTWMFLLLLGGAWAGNTQYAWETTAra 

KCEDAYPGQITDGMVCAGSSKGADTCQGDSGGPLVCDGALQGITSWGSDPCGRSDKPGVYTNICRYL 
DWIKK 1 1 GSKGLEG 






SEQ ID NO: 53    358 bp


NOV11d,
255612524 DNA 
Sequence 


CACCAAGCTTGGAAATACACAGTACGCCTGGGAGACCACAGCCTACAGAATAAAGATGGCCCAGAAG 
TGCAGTCCCCGAGAGAATTTTCCTGACACTCTCAACTGTGCAGAAGTAAAAATCTTTCCCCAGAAGA 
AGTGTG AGGATGC TTACC CGG G GC AG ATC AC AGATGGC ATGGTC TGTGCAGGCAGCAGC AAAGGGGC 

TGGGGCTCAGACCCCTGTGGGAGGTCCGACAAACCTGGCGTCTATACCAACATCTGCCGCTACCTGG 
ACTGGATCAAGAAGCTCGAGGGC 




ORF Start: at 2    ORF Stop: end of sequence





SEQ ID NO: 54 


119 aa    MW at 12908.4kD


NOV11d,
255612524 
Protein Sequence 


TKLGNTQYAWETTAYRIKMAQKCSPRENFPDTLNCAEVKIFPQKKCEDAYPGQ 
DTCQGDSGGPLVCDGALQGITSWGSDPCGRSDKPGVYTNICTYLDWIKKLEG 
SEQ ID NO: 55


307 bp 


NOV11e,
255612566 DNA 
Sequence 


CACCAAGCTTCAGAAGTGCAGTCCCCGAGAGAATTTTCCTGACACTCTCAACTGTGCAGAAGTAAAA 
ATCTTTCCCCAGAAGAAGTGTGAGGATGCTTACCCGGGGCAGATCACAGATGGCATGGTCTGTGCAG 
GCAGCAGCAAAGGGGCTGACACGTGCCAGGGCGATTCTGGAGGCCCCCTGGTGTGTGATGGTGCACT 
C CAGGGCATC ACATCCTGGGGC TCAG ACCCCTGTGGGAGGTCCGACAAAC C TGGCGTCTATACC AAC 
ATCTGCCGC TACC TGGAC TGGATCAAGAAG CTCGAGGGC 




ORF Start: at 2 


ORF Stop: end of sequence 





SEQ ID NO: 56    102 aa    MW at 10922.2kD


NOV11e,
255612566 
Protein Sequence 


TKLQKCSPRENFPDTLNCAEVKIFPQKKCEDAYPGQITDGMVCAGSSKGADTCQGDSGGPLVCDGAL 
QGI TSWGSDPCGRSDKPGVYTNI CRYLDWIKKLEG 





SEQ ID NO: 57 


178 bp 


NOV11f,
306434072 DNA 
Sequence 


CACCGGATCCGGGCAGGGCGATTCTGGAGGCCCCCTGGTGTGTGATGGTGCACTCCAGGGCATCACA 
TCCTGGGGCTCAGACCCCTGTGGGAGGTCCGACAAACCTGGCGTCTATACCAACATCTGCCGCTACC 
TGGAC TG G ATC AAG AAG ATC ATAGGCAGC AAGGGC C TC GAG GGC 




ORF Start: at 2 


ORF Stop: end of sequence 





SEQ ID NO: 58    59 aa    MW at 6072.7kD


NOV11f,
306434072 
Protein Sequence 


TGSGQGD SGGPLVCDGALQGI TSWGSDPCGR S DKPGVYTNICRYLDWIKK 1 1 GS KGLEG 





SEQ ID NO: 59 


436 bp 


NOV11g,
CG144112-02
DNA Sequence 


AGTGTGCTGGAATTCGCCCTTACTGGGTCCGAATCAGTAGGTGACCCCGCCCCTGGATTCTTGAAGA 


CCTCACCATGGGACGCCCCCGACCTCGTGCGGCCAAGACGTGGATGTTCCTGCTCTTGCTGGGGGGA 
GCCTGGGCAGAGAATTTTCCTGACACTCTCAACTGTGCAGAAGTAAAAATCTTTCCCCAGAAGAAGT 
GTGAGGATGCTTACCCGGGGCAGATCACAGATGGCATGGTCTGTGCAGGCAGCAGCAAAGGGGCTGA 
CACGTGCCAGGGCGATTCTGGAGGCCCCCTGGTGTGTGATGGTGCACTCCAGGGCATCACATCCTGG 
GGCTCAGACCCCTGTGGGAGGTCCGACAAACCTGGCGTCTATACCAACATCTGCCGCTACCTGGACT 
GG ATCAAGAAGAT C ATAGGC AG C AAGGG C T GATT 




ORF Start: ATG at 75 


ORF Stop: TGA at 432
SEQ ID NO: 60    119 aa    MW at 12718.4kD


NOV11g,
CG144112-02
Protein Sequence 


MGRPRPRAAKTWMFLLLLGGAWAEOTPDTI^CAEVKIFPQKKCEDAYPGQITDGMVCAGSSKGADTC 
QGDSGGPLVCDGALQGITSWGSDPCGRSDKPGVYTNICRYI>DWIKKIIGSKG 
SEQ ID NO: 61


845 bp




NOV11h,
CG144112-03 
DNA Sequence 


CGCCCTTAC TGGGTCCGAATCAGT AGGTG ACCC CGCCCC TGGATTCTGGAAGACCTC ACC ATGGGAC 


GCCCCCGACCTCGTGCGGCCAAGACGTGGATGTTCCTGCTCTTGCTGGGGGGAGCCTGGGCAGGACA 
CTCCAGGGCACAGGAGGACAAGGTGCTGGGGGGTCATGAGTGCCAACCCCATTCGCAGCCTTGGCAG 
GCGGCCTTGTTCCAGGGCCAGCAACTACTCTGTGGCGGTGTCCTTGTAGGTGGCAACTGGGTCCTTA 
CAGCTGCCCACTGTAAAAAACCGAAATACACAGTACGCCTGGGAGACCACAGCCTACAGAATAAAGA 
TGGCCCAGAGCAAGAAATACCTGTGGTTCAGTCCATCCCACACCCCTGCTACAACAGCAGCGATGTG 
GAGGACCACAACCATGATCTGATGCTTCTTCAACTGCGTGACCAGGCATCCCTGGGGTCCAAAGTGA 
AGCCCATCAGCCTGGCAGATCATTGCACCCAGCCTGGCCAGAAGTGCACCGTCTCAGGCTGGGGCAC 
TGTCACCAGTCCCCGAGAGAATTTTCCTGACACTCTCAACTGTGCAGAAGTAAAAATCTTTCCCCAG 
AAGAAGTGTGAGGATGCTTACCCGGGGCAGATCACAGATGGCATGGTCTGTGCAGGCAGCAGCAAAG 
GGGCTGACACGTGCCAGGGCGATTCTGGAGGCCCCCTGGTGTGTGATGGTGCACTCCAGGGCATCAC 
ATCCTGGGGCTCAGAC CCCTGTGGGAGGTC CGACAAACCTGGCGTC TATACCAACATC TGC CGC TAC 
CTGGACTGGATCAAGAAGATCATAGGCAGCAAGGGCTGATT 




ORF Start: ATG at 61


ORF Stop: TGA at 841
SEQ ID NO: 62


260 aa    MW at 28047.6kD


NOV11h,
CG144112-03
Protein Sequence 


MGRPRPRAAKTO^LLIJ^GGAWAGHSRAQEDKVLGGHEC^ 

VLTAAHCKKPKYTVRLGDHSLQNKDGPEQE I PWQS IPHPCYNSSDVEDHNHDLMLLQLRDQASLGS 
KVKPISLADHCTQPGQKCTVSGWGTVTSPRENFPDTLNCAEVKIFPQKKCEDAYPGQITDGMVCAGS 
SKGADTCQGDSGGPLVCDGALQGITSWGSDPCGRSDKPGVYTNICRYLDWIKKIIGSKG 
Sequence comparison of the above protein sequences yields the following sequence
relationships shown in Table 11B.



Table 11B. Comparison of NOV11a against NOV11b through NOV11h.


Protein Sequence 


NOV11a Residues/
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV11b


97..142
31..76 


46/46 (100%) 
46/46(100%) 


NOV11c


1..142 
4..145


142/142 (100%) 
142/142 (100%) 


NOV11d


24..139
4..119


114/116(98%) 
115/116(98%) 





NOV11e


41..139
4..102


9^99^9^) " 
98/99 (98%) 


NOV11f


91..142 
5..56 


52/52 (100%) 
52/52(100%) 


NOV11g


1..142 
1..119 


119/142 (83%) 
119/142(83%) 


NOV11h


44..142
162..260 


99/99 (100%) 
99/99 (100%) 



Further analysis of the NOV11a protein yielded the following properties shown in
Table 11C. 
Table 11C. Protein Sequence Properties NOV11a


PSort analysis: 


0.3700 probability located in outside; 0.1000 probability located in 
endoplasmic reticulum (membrane); 0.1000 probability located in
endoplasmic reticulum (lumen); 0.1000 probability located in lysosome 
(lumen) 


SignalP analysis: 


Cleavage site between residues 24 and 25 



A search of the NOV11a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 11D.






Table 11D. Geneseq Results for NOV11a


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date]


NOV11a
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


ABP41332 


Human ovarian antigen 
HCOQP78, SEQ ID NO:2464 
- Homo sapiens, 315 aa. 
[WO200200677-A1, 
03-JAN-2002] 


44..142
217..315 


99/99(100%) 
99/99 (100%) 


3e-57 


AAU81959 


Human PRO322 - Homo
sapiens, 260 aa. 
[WO200109327-A2, 
08-FEB-2001] 


44..142
162..260 


99/99 (100%) 
99/99(100%) 


3e-57 


ABB84852 


Human PRO322 protein
sequence SEQ ID NO:72 - 
Homo sapiens, 260 aa. 
[WO200200690-A2, 
03-JAN-2002] 


44..142
162..260 


99/99 (100%) 
99/99 (100%) 


3e-57 


ABB95458 


Human angiogenesis related 
protein PRO322 SEQ ID NO:
72 - Homo sapiens, 260 aa. 
[WO200208284-A2, 
31-JAN-2002] 


44..142
162..260 


99/99 (100%) 
99/99 (100%) 


3e-57 


AAB53087 


Human 

angiogenesis-associated 
protein PRO322, SEQ ID
NO: 127 - Homo sapiens, 260 
aa. [WO200053753-A2, 
14-SEP-2000] 


44..142
162..260 


99/99(100%) 
99/99 (100%) 


3e-57 



In a BLAST search of public sequence databases, the NOV11a protein was found to
have homology to the proteins shown in the BLASTP data in Table 11E.



Table 11E. Public BLASTP Results for NOV11a


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV11a
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q9NR68 


Serine protease 
kallikrein/ovasin/neuropsin 
type 3 - Homo sapiens 
(Human), 119 aa. 


1..142 
1..119


119/142(83%) 
119/142(83%) 


9e-66 
O60259


Neuropsin precursor (EC 
3.4.21.-) (NP) (Kallikrein 8)
(Ovasin) (Serine protease 
TADG-14) 
(Tumor-associated 
differentially expressed 
gene-14 protein) - Homo 
sapiens (Human), 260 aa. 


44..142
162..260


99/99 (100%)
99/99 (100%)




O88780


Neuropsin precursor (EC 
3.4.21.-) (NP) (Kallikrein 8) 
(Brain serine protease 1) - 
Rattus norvegicus (Rat), 260 
aa. 


38..141
147..259 


80/113 (70%) 
93/113(81%) 


8e-45 


BAB92021 


Neuropsin - Mus musculus 
(Mouse), 176 aa (fragment). 


38..141 
63..175 


81/113(71%) 
92/113 (80%)


1e-44


Q61955 


Neuropsin precursor (EC 
3.4.21.-) (NP) (Kallikrein 8) - 
Mus musculus (Mouse), 260 
aa. 


38..141 
147..259 


81/113(71%) 
92/113 (80%) 


1e-44



PFam analysis predicts that the NOV11a protein contains the domains shown in the
Table 11F.



Table 11F. Domain Analysis of NOV11a


Pfam Domain 


NOV11a Match Region


Identities/ 
Similarities 

for the Matched Region 


Expect Value


trypsin 


49..134 


47/101(47%) 
76/101 (75%) 


5.5e-40 



Example 12. 

The NOV12 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 12A. 



Table 12A. NOV12 Sequence Analysis 




SEQ ID NO: 63


1536 bp


NOV12a, 
CG144497-01 
DNA Sequence 


AAGAGCCAAGCCAGCATGTCGGGGACCCGAGCCTCCAACGACCGGCCCCCCGGCGCAC?GCC;Rr«Tra 

AGCGGGGGCGGCTGCAGCAGGAGGCGGCGGCGACCGGCTCCCGCGTGACGGTGGTGCTGGGCGCGCA 

GTGGGGGGACGAGGGCAAAGGCAAGGTGGTGGACCTGCTGGCCACGGACGCCGACATCATCA 

TGCCAGGGGGGCAACAACGCCGGCCACACGGTGGTGGTGGATGGGAAAGAGTACGACTTCCACCTGC 

TGCCCAGCGGCATCATC^CACC^GGCCGTGTCCTT(^TTGGTAACGGGGTGGTCATCaVCTTGCC 

AGGCTTGO?TTGAGGAAGCAGAGAAGAATGAAAAGAAAGGTCTGAAGGACTGGGAGAAGAGGCTCATC 


ATCTCTC^GAGAGCCCACCTTGTGTTTGATTTTC^ 

GCCAGGCACAAGAGGGGAAGAGTATAGGCACCACCAAGAAGGGAATCGGACCAACCTACTCTTCCAA 
AGCTGCCCGGACAGGCCTCCGCATCTGCGACCTCCTGTCAGATTTTGATGAGTTTTCCTCCAGATTC 
AAGAACCTGGCCCACCAGCACCAGTCGATGTTCCCCACCCTGGAAATAGACATTGAAGGCCAACTCA 
AAAGGCTCAAGGGCTTTGCTGAGCGGATCAGACCCATGGTCCGAGATGGTGTTTACTTTATGTATGA 
GGCAC TCCACGGCCCC CCCAAGAAGATCC TGGTGGAGGGTGCCAAC GCCGCCCTCCTCGACATTGAC 
TTCGGTACCTACCCCTTTGTGACTTCATCCAACTGCACCGTGGGCGGTGTGTGCACGGGCCTGGGCA 
TCCCCCCGCAGAACATAGGTGACGTGTATGGCGTGGTGAAAGCCTATACCACACGTGTGGGCATCGG 
GGCCTTCCCCACCGAGCAGATCAACGAGATTGGAGGCCTGCTGCAGACCCGCGGCCACGAGTGGGGA 
GTGAC CAC AGGCAGGAAGAGGCGCTGCGGC TGGCTCGACCTGATG ATTCTAAG ATATGCTC ACATGG 
TCAACGGATTCACTGCGCTGGCCCTGACGAAGCTGGACATCCTGGACGTACTGGGTGAGGTTAAAGT 
CGGTGTCTCATACAAGCTGAACGGGAAAAGGATTCCCTATTTCCCAGCTAACCAGGAGATGCTTCAG 
AAGGTCGAAGTTGAGTATGAAACGCTGCCTGGGTGGAAAGCAGACACCACAGGCGCCAGGAGGTGGG 
AGGACCTGCCCCCACAGGCCCAGAACTACATCCGCTTTGTGGAGAATCACGTGGGAGTCGCAGTCAA 
ATGGGTTG^T^TT^ 1 ^ A & r! ar: A HTPfia TG ATf!C AGCTGTTTTAGTC AC AG ACTGAGCTG ATC 
CCAACAGGCCCTGGCAGCGTCTGGACTTGTGTAAACAGCAGCAGTCACGTTCCTCGGCCGCCACAAC 


CAACACCAAAGCAGGAAAACCATTTTCTGTACTTTTATATTTCTGTTCAACCTGTTGGTTTC 


ORF Start: ATG at 16    ORF Stop: TAG at 1387





SEQ ID NO: 64    457 aa    MW at 50181.0kD


NOV12a, 
CG144497-01 
Protein Sequence 


MSGTRASKTORPPGAGGVKRGRLQQEAAATCSRVTWLG^ 

NAGHTVWDGKEYDFHIiliPSGI INTKAVSFIGNGVVIHLPGLFEEAEKNEECKGI*KDWEKRLI ISDRA 
HLVFDFHQAVDGLQEVQRQAQEGKS IGTTKKGIGPTYS SKAARTGLRICDLLSDFDEFS SRFKNLAH 
QHQSMFPTLEIDIEGQLKEUjKGFAERIRPMVRDG^^ 

FVTSSNCTVGGVCTGLGIPPQNIGDVYGWKAYTTRVGIGAFPTEQINEIGGLLQTRGHEWGVTTGR 

KRRCGWLDLMILRYAHMVNGFTALALTKLDIIJDV^ 

YETLPGWKADTTGARRWEDIiPPQAQNYIRFVENHV^ 



Further analysis of the NOV12a protein yielded the following properties shown in 
Table 12B. 
Table 12B. Protein Sequence Properties NOV12a


PSort analysis: 


0.5946 probability located in microbody (peroxisome); 0.3000 probability 
located in nucleus; 0.2377 probability located in lysosome (lumen); 0.1000 
probability located in mitochondrial matrix space 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV12a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 12C. 



Table 12C. Geneseq Results for NOV12a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV12a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAB41627 


Human ORFXORF1391 
polypeptide sequence SEQ 
ID NO:2782 - Homo sapiens, 
314 aa. [WO200058473-A2, 
05-OCT-2000] 


144..457 
1..314 


313/314 (99%) 
314/314 (99%) 


0.0 


ABB70971 


Drosophila melanogaster 
polypeptide SEQ ID NO 
39705 - Drosophila 
melanogaster, 447 aa. 
[WO200171042-A2, 
27-SEP-2001] 


31..456 
24..446 


270/427 (63%) 
338/427 (78%) 


e-161 


AAY95049 


Candida albicans polypeptide 
sequence # 17 - Candida 
albicans, 412 aa. 
[EP982401-A2, 
01-MAR-2000] 


35..455 
4..409 


227/425 (53%) 
306/425 (71%) 


e-130 


AAU23499 


Novel human enzyme 
polypeptide #585 - Homo 
sapiens, 209 aa. 
[WO200155301-A2, 
02-AUG-2001] 


249..457 
1..209 


208/209 (99%) 
209/209 (99%) 


e-121 


AAW99455 


Maize adenylosuccinate 
synthetase - Zea mays, 484 
aa. [US5882869-A, 
16-MAR-1999] 


24..454 
53..482 


217/436 (49%) 
310/436 (70%) 


e-119 
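
The Expect values in these tables use the abbreviated notation of older BLAST reports, in which a missing mantissa (for example "e-161") is understood to mean a mantissa of 1. The sketch below is illustrative only; the function name is an assumption and is not part of the source.

```python
def parse_expect(text: str) -> float:
    """Convert a BLAST-style Expect string such as 'e-161' or '2e-17' to a float."""
    value = text.strip().lower()
    if value.startswith("e"):
        # Abbreviated form: an implicit mantissa of 1 is assumed.
        value = "1" + value
    return float(value)


for raw in ("0.0", "e-161", "e-119"):
    print(raw, "->", parse_expect(raw))
```

Smaller Expect values indicate matches that are less likely to have arisen by chance.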


In a BLAST search of public sequence databases, the NOV12a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 12D. 


Table 12D. Public BLASTP Results for NOV12a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV12a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


BAC04649 


CDNA FLJ38602 fis, clone 
HEART2003836, highly 
similar to 

ADENYLOSUCCINATE 
SYNTHETASE, MUSCLE 
ISOZYME (EC 6.3.4.4)- 
Homo sapiens (Human), 457 
aa 


1..457 
1..457 


456/457(99%) 
457/457(99%) 


0.0 






P28650 


Adenylosuccinate synthetase, 
muscle isozyme (EC 6.3.4.4) 
(IMP-aspartate ligase) 
(ADSS) (AMPSase) - Mus 
musculus (Mouse), 457 aa. 


1..457 
1..457 


453/457 (98%) 


__t««jf WilliAi .1^ t> 

0.0 


AJMSDS 


adenylosuccinate synthase 
(EC 6.3.4.4), muscle - mouse, 
452 aa. 


1..425 
1..425 


411/425 (96%) 
421/425 (98%) 


0.0 


AAH32039 


Similar to 

ADENYLOSUCCINATE 
SYNTHETASE, MUSCLE 
ISOZYME 
(IMP-ASPARTATE 
LIGASE) (ADSS) 
(AMPSASE) - Homo sapiens 
(Human), 502 aa (fragment). 


64..457 
109..502 


392/394 (99%) 
394/394 (99%) 


0.0 


Q9CQL9 


Adenylosuccinate synthetase 
(EC 6.3.4.4) (IMP-aspartate 
ligase) (ADSS) (AMPSase) - 
Mus musculus (Mouse), 456 
aa. 


8..457 
4..456 


345/453 (76%) 
399/453 (87%) 


0.0 



PFam analysis predicts that the NOV12a protein contains the domains shown in the 
Table 12E. 
Table 12E. Domain Analysis of NOV12a 


Pfam Domain 


NOV12a Match 
Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


Ald_Xan_dh_C 


396..411 


8/16 (50%) 
14/16(88%) 


0.43 


Adenylsucc_synt 


32..455 


261/431 (61%) 
417/431 (97%) 


0 



Example 13. 

The NOV13 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 13A. 



Table 13A. NOV13 Sequence Analysis 
SEQ ID NO: 65 | 278 bp 




NOV13a, 
CG144686-01 
DNA Sequence 


TGCCTGTGGGTTTGATTGCTACCACTCTTGCAATTGCTCCTGTCCGCTTTGACAGGGAGAAGGTGTT 
CCGCGTGAAGCCTCAGGATGAAAAACAAGCAGACATCATAAAGGACTTGGCCAAAACCAGTGAGCTC 
CGAGATAAAGGCAAATTTGGTTTTCTCCTTCCAGAATCCCGGATAAAGCCAACGTGCAGAGAGACCA 
TGCTAGCTGTCAAATTTATTGCCAAGTATATCCTCAAGCATACTTCCTAAAGAACTGCCCTCTGTTT 
GGAATAAGCC 


ORF Start: at 3 | ORF Stop: TAA at 249 





SEQ ID NO: 66 | 82 aa | MW at 9327.9kD 


NOV13a, 
CG144686-01 
Protein Sequence 


PVGLIATTLAIAPVRFDREKVFRVKPQDEKQADIIKDLAKTSELRDKGKFGFLLPESRIKPTCRETM 

LAVKFIAKYILKHTS 





SEQ ID NO: 67 | 268 bp 


NOV13b, 
278690008 DNA 
Sequence 


CACCGGATCCACCCCTGTGGGTTTGATTGCTACCACTCTTGCAATTGCTCCTGTCCGCTTTGACAGG 
GAGAAGGTGTTCCGCGTGAAGCCTCAGGATGAAAAACAAGCAGACATCATAAAGGACTTGGCCAAAA 

CGAGTGAGCTCCGAGATAjy^GGCAAATT/TGGTTTTCTCCTTCCAGAA 
CAGAGAGACCATGCTAGCTGTCAAATOTATTGCC^ 




ORF Start: at 2 |ORF Stop: end of sequence 





SEQ ID NO: 68 | 89 aa | MW at 9973.6kD 


NOV13b, 
278690008 
Protein Sequence 


TGSTPVGLIATTLAIAPVRFDREKVFRVKPQDEKQADIIKDLAKTSELRDKGKFGFLLPESRIKPTC 
RETMLAVKFIAKYILKHTSLEG 





SEQ ID NO: 69 | 94 bp 


NOV13c, 
278690035 DNA 
Sequence 


CACCGGATCCACCAGTGAGCTCCGAGATAAAGGCAAATTTGGTTTTCTCCTTCCAGAATCCCGG 
AAGCCAACGTGCAGAGAGCTCGAGGGC 




ORF Start: at 2 ORF Stop: end of sequence 





SEQ ID NO: 70 | 31 aa | MW at 3452.9kD 


NOV13c, 
278690035 
Protein Sequence 


TGSTSELRDKGKFGFLLPESRIKPTCRELEG 



NOV13d, 
CG144686-02 
DNA Sequence 


SEQ ID NO: 71 | 1622 bp 


ATGAGGCTCATCCTGCCTGTGGGTTTCATTGCTACCACTCTTGCAATTGCTCCTGTCCGCTTTGACA 

GGGAGAAGGTGTTCCGCGTGAAGCCCCAGGATGAAAAACAAGCAGACAT<^T^ 

AACCAATGAGCTTGACTTCTGGTATCCAGGTGCCACCCACCACGTAGCTGCTAATATGATGGTGGAT 

TTCCGAGTTAGTGAGAAGGAATCCCAAGCCATCCAGTCTGCCTTGGATCAAAATAAAATGCACTATG 

AAATCTTGATTCATGATCTACAAGAAGAGATTGAGAAACAGTTTGATGTTAAAGAAGATATCCCAGG 

CAGGCACAGCTACGCAAAATACAATAATTGGGAAAAGATTGTGGCTTGGAC0X5AAAAGATGATGGAT 

AAGTATCCTGAAATGGTCTCTCGTATTAAAATTGGATCTACTGTTGAAGATAATCCACTATATGTTC 

TGAAGATTGGGGAAAAGAATGAAAGAAGAAAGGCTATTTTTATGGATTGTGGCATTCACGCACGAGA 

ATGGGTCTCCCCAGCATTCTGCGAGTGGTOTGTCT 

ATTATGACCAAACTCTTGGACCGAATGAATTTTTACAOT 

TTTGGTCATGGAC^AAGAACCGCATGTGGAGAAAAAATCGTTCCAAGAACCAAAACTCCAAATGCAT 
CGGCACTGACCTCAACAGGAATTTTAATGCTTCATGGAACTCCATTCCTAACACCAATGACCCATGT 
GCAGATAACTATCGGGGCTCTGCACCAGAGTCCGAGAAAGAGACGAAAGCTGTCACTAATTTCATTA 
GAAGCCACCTGAATGAAATCAAGGTTTACATCACCTTCCATTCCTACTCCCAGATGCTATTGTTTC^ 
CTATGGATATACATCAAAACTGCCACCTAACCATGAGGACTTGGCCAAAGTTGCAAAGATTGGCACT 
GATGTTCTATCAACTCGATATGAAACCCGCTACATC 

TATC^GGTTCTTCTTTAGACTGGGCTTATGACCTGGGCATCAAACAC^CATTTGCCTTTGAGCTCCG 
AGATAAAGGCAAATTTGGTTTTCTCCTTCCAGAATCCCGGATAAAGCCAACGTGCAGAGAGACCATG 
CTAGCTGTCAAATTTATTGCCAAGTATATCCTCAAGCATACTTCCTAAAGAACTGCCCTCTGTTTGG 
AATAAGCCAATTAATCCTTTTTTGTGCCTTTCATCAGAAAGTCAATCTTCAGTTATCCCCAAATGCA 


GCTTCTATTTCACCTGAATCCTTCTCTTGCTCATTTAAGTCCCATGTTACTGCTGTTTGCTTTTACT 


TACTTTCAGTAGCACC^TAACGAAGTAGCTTTAAGTGAAACCTTTTAACTACCTTTCTTTGCTCCAA 


GTGAAGTTTGGACCCAGCAGAAAGCATTATTTTGAAAGGTGATATACAGTGGGGCACAGAAAACAAA 


TGAAAACCCTCAGTTTCTCACAGATTTTCACCATGTGGCTTCATCAATTTATGTGCTAATACAATAA 


AATAAAATGCACTT 




ORF Start: ATG at 1 | |ORF Stop: TAA at 1252 





SEQ ID NO: 72 | 417 aa | MW at 48699.4kD 


NOV13d, 
CG144686-02 
Protein Sequence 


MRLILPVGLIATTLAIAPVRFDREKVFRVKPQDEKQADIIKDLAKTNELDFWYPGATHHVAANMMVD 

FRVSEKESQAIQSALDQNKMHYEILIHDLQEEIEKQFDVKEDIPGRHSYAKYNNWEKIVA 

KYPEMVSRIKIGSWEENPMVLKIGEKNER^^ 

IMTKLLDRMNFYILPVFNVDGYIWSWTKNRMWRKl^ 

ADNYRGSAPESEKETKAVTOTIRSHLiraiKW 

DVLSTRYETRYIYGPIESTIYPISGSSLDWAYDLGIKHTFAFELRDKGKFGFLLPESRIKPTCRETM 
LAVKFIAKYILKHTS 
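
The relationship between the nucleotide sequences, the stated ORF start positions, and the protein sequences in Table 13A follows the standard genetic code. The following is a minimal sketch under stated assumptions: the ORF start is a 1-based position, translation proceeds in frame until a stop codon or the end of the sequence, and the function and variable names are illustrative rather than taken from the source.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, orf_start: int = 1) -> str:
    """Translate dna in frame from the 1-based position orf_start.

    Translation stops at the first stop codon or when fewer than
    three bases remain ("end of sequence").
    """
    seq = dna.upper().replace(" ", "")
    residues = []
    for i in range(orf_start - 1, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# First 13 bases of the NOV13b and NOV13c constructs, ORF start at 2
print(translate("CACCGGATCCACC", orf_start=2))
```

Applied to the leading bases shared by the NOV13b and NOV13c constructs, this yields the residues TGST with which both protein sequences begin.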



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 13B. 



Table 13B. Comparison of NOV13a against NOV13b through NOV13d. 


Protein Sequence 


NOV13a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV13b 


1..82 
5..86 


82/82(100%) 
82/82(100%) 
NOV13c 


41..65 
4..28 


25/25 (100%) 
25/25 (100%) 


NOV13d 


1..44 
6..49 


43/44(97%) 
44/44 (99%) 



Further analysis of the NOV13a protein yielded the following properties shown in 
Table 13C. 
Table 13C. Protein Sequence Properties NOV13a 


PSort analysis: 


0.5500 probability located in endoplasmic reticulum (membrane); 0.1900 
probability located in lysosome (lumen); 0.1000 probability located in 
endoplasmic reticulum (lumen); 0.1000 probability located in outside 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV13a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 13D. 



Table 13D. Geneseq Results for NOV13a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent*, Date] 


NOV13a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAU84325 


Protein CPA3 differentially 
expressed in breast cancer 
tissue - Homo sapiens, 417 
aa. [WO200210436-A2, 
07-FEB-2002] 


1..44 
6..49 


43/44 (97%) 
44/44 (99%) 


2e-17 


AAG75369 


Human colon cancer antigen 
protein SEQ ID NO:6133 - 
Homo sapiens, 180 aa. 
[WO200122920-A2, 
05-APR-2001] 


43..82 
141..180 


40/40 (100%) 
40/40 (100%) 


9e-17 


AAU04477 


Porcine carboxypeptidase B 
(CpB) protein - Sus scrofa, 
306 aa. [WO200151624-A2, 
19-JUL-2001] 


41..80 
266..305 


25/40(62%) 
34/40 (84%) 


4e-10 


AAR75132 


Porcine carboxypeptidase B - 
Sus scrofa, 306 aa. 
[WO9514096-A1, 
26-MAY-1995] 


41..80 
266..305 


25/40 (62%) 
34/40 (84%) 


4e-10 


AAR75131 


Porcine Tyr-His-Met 
Procarboxypeptidase B - Sus 
scrofa, 404 aa. 
[WO9514096-A1, 
26-MAY-1995] 


41..80 
364..403 


25/40 (62%) 
34/40(84%) 


4e-10 



In a BLAST search of public sequence databases, the NOV13a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 13E. 



Table 13E. Public BLASTP Results for NOV13a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV13a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


P15088 


Mast cell carboxypeptidase A 
precursor (EC 3.4.17.1) 
(MC-CPA) (Carboxypeptidase 
A3) - Homo sapiens (Human), 
417 aa. 


1..44 
6..49 


43/44(97%) 
44/44 (99%) 


5e-17 


P97597 


Mast cell carboxypeptidase A 
precursor - Rattus norvegicus 
(Rat), 412 aa (fragment). 


43..82 
373..412 


37/40(92%) 
39/40(97%) 


1e-14 


P21961 


Mast cell carboxypeptidase 
(EC 3.4.17.1) (RMC-CP) 
(Carboxypeptidase A3) - Rattus 
norvegicus (Rat), 309 aa. 


43..82 
270..309 


37/40 (92%) 
39/40(97%) 


1e-14 


P15089 


Mast cell carboxypeptidase A 
precursor (EC 3.4.17.1) 
(MC-CPA) (Carboxypeptidase 
A3) - Mus musculus (Mouse), 
417 aa. 


43..82 
378..417 


36/40(90%) 
39/40(97%) 


7e-14 


P00732 


Carboxypeptidase B (EC 
3.4.17.2) - Bos taurus (Bovine), 
306 aa. 


41..80 
266..305 


26/40(65%) 
36/40(90%) 


7e-11 



PFam analysis predicts that the NOV13a protein contains the domains shown in the 
Table 13F. 
Table 13F. Domain Analysis of NOV13a 


Pfam Domain 


NOV13a Match Region 


Identities/ 
Similarities 
for the Matched 
Region 


Expect Value 


Zn_carbOpept 


41..65 


16/30(53%) 
24/30(80%) 


5.6e-08 



Example 14. 

The NOV14 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 14A. 



Table 14A. NOV14 Sequence Analysis 



SEQ ID NO: 73 | 829 bp 



NOV14a, 
CG144906-01 
DNA Sequence 



GCCCTTCGCGGGAGAGGAGGCCATGGGCGCGCGCGGGGCGCTGCTGCTGGCGCTGCTGCTGGCTCGG 



GCTGGACTCAGGAAGCCGGAGTCGCAGGAGGCGGCGCCCTTATCAGGACCATGCGGCCGACGGGTCA 
TCACGTCGCGCATCGTGGGTGGAGAGGACGCCGAACTCGGGCGTTGGCCGTGGCAGGGGAGCCTGCG 
CCTGTGGGATTCCCACGTATGCGGAGTGAGCCTGCTCAGCCACCGCTGGGCACTCACGGCGGCGCAC 
TGCTTTGAAACCTATAGTGACCTTAGTGATCCCTCCGGGTGGATGGTCCAGTTTGGCCAGCTGACTT 
CCATGCCATCCTCCACATTTGAGTTTGAGAACCGGACAGACTGCTGGGTGACTGGCTGGGGGTACAT 
CAAAGAGGATGAGGCACTGCCATCTCCCCACACCCTCCAGGAAGTTCAGGTCGCCATCATAAACAAC 
TCTATGTGCAACCACCTCTTCCTCAAGTAC^GTTTCCGCAAGGAC^TCTTTGGAGACATGGTTTGTG 
CTGGCAATGCCCAAGGCGGGAAGGATGCCTGCTTCGGTCACTC^GGTGGACCCTTGGCCTGTAACAA 
GAATGGACTGTGGTATCAGATTCGAGTCGTCAGCTGGGGAGTGGGCTGTGGTCGGCCCAATCGGCCC 
GGTGTCTACACCAATATCAGCCACCACTTTGAGTGGATCCAGAAGCTGATGGCCCAGAGTGGCATGT 
CCCAGCCAGACCCCTGCTGGCCACTACTCTTTTTCCCTCTTCTCTGGGCTCTCCCACTCCTGGGGCC 
GGTCTGAGCCTACCTGAGCCCATGC 



ORF Start: ATG at 23 | ORF Stop: TGA at 809 





SEQ ID NO: 74 | 262 aa | MW at 28826.7kD 


NOV14a, 
CG144906-01 
Protein Sequence 


MGARGALLLALLLARAGLRKPESQEAAPLSGPCGRRVITSRIVGGE 

GVSLLSHRWALTAAHCFETYSDLSDPSGWMVQFGQLTSMPSSTF^^ 

SPHTLQEVQVAinOTSMCNHLFIJCYSFRKDI^^^ 

GWSWGVGCGRPNRPGVYTNISHHFEWIQKLMAQSGMSQPDPSWPLLFFPLLWALPLLGPV 





SEQ ID NO: 75 | 989 bp 


NOV14b, 
CG144906-02 
DNA Sequence 


AATCGCCCTTCGCGGGAGAGGAGGCCATGGGCGCGCGCGGGGCGCTGCTGCTGGCGCTGCTGCTGGC 
TCGGGCTGGACTCAGGAAGCCGGAGTCGCAGGAGGCGGCGCCCTTATCAGGACCATGCGGCCGACGG 
GTCATCACGTCGCGCATCGTGGGTGGAGAGGACGCCGAACTCGGGCGTTGGCCGTGGCAGGGGAGCC 
TGCGCCTGTGGGATTCCCACGTATGCGGAGTGAGCCTGCTCAGCCACCGCTGGGCACTCACGGCGGC 
GCACTGCTTTGAAACCTATAGTGACCTTAGTGATCCCTCCGGGTGGATGGTCCAGTTTGGCCAGCTG 
ACTTCCATGCCATCCTTCTGGAGCCTGCAGGCC 

TGAGCCCTCGCTACCTGGGGAATTCACCCTATGACATTGCCTTGGTGAAGCTGTCTGCACCTGTCAC 
CTACACTAAACACATCCAGCCCATCTGTCTCCAGGCCTCCACATTTGAGTTTGAGAACCGGACAGAC 
TGCTGGGTGACTGGCTGGGGGTACATCAAAGAGGATGAGGCACTGCCATCTC 

AAGTTCAGGTCGCCATCATAAACAACTCTATGTGCAACCACCTCTTCCTCAAGTACAGTTTCCGCAA 
GGACATCTTTGGAGACATGGTTTGTGCTGGCAATGCCCAAGGCGGGAAGGATGCCTGCTTCGGTGAC 
TCAGGTGGACCCTTGGCCTGTAACAGGAATGGACTGTGGTATCAGATTGGAGTCGTGAGCTGGGGAG 
TGGGCTGTGGTCGGCCCAATCGGCCCGGTGTCTACACCAATATCAGCCACCACTTTGAGTGGATCCA 
GAAGCTGATGGCCCAGAGTGGCATGTCCCAGCCAGACCCCTCCTGGCCACTACTCTTTTTCCCTCTT 
CTCTGGGCTCTCCCACTCCTGGGGCCGGTCTGAGCCTACCTTAGCCCATGC 




ORF Start: ATG at 27 | ORF Stop: TGA at 969 





SEQ ID NO: 76 | 314 aa | MW at 34911.6kD 


NOV14b, 
CG144906-02 
Protein Sequence 


MGARGALLLAIJ^LARAGLRKPESQEAAPLSGPCGRRV^ 

GVSLLSHRWALTAAHCFETYSDLSDPSGWMVQFGQLTSMPSFWSLQAYYTRYFVSNIYIiSPRYLGNS 
FTOIALVKLSAPVTYTKHIQPIClXJASTPEFEimTDCWVTGWGYIKEDEALPSPH 
SMCNHLFLKYSFRKDIFGD1WCAGNAQGGKDACFGDSGGPLACNRNGLWYQIGW 
GVYTNISHHFEWIQKLMAQSGMSQPDPSWPLLFFPLLWALPLLGPV 



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 14B. 



Table 14B. Comparison of NOV14a against NOV14b. 


Protein Sequence 


NOV14a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV14b 


20..240 
20..292 


219/273 (80%) 
221/273 (80%) 



Further analysis of the NOV14a protein yielded the following properties shown in 
Table 14C. 



Table 14C. Protein Sequence Properties NOV14a 


PSort analysis: 


0.5422 probability located in outside; 0.4639 probability located in lysosome 
(lumen); 0.2779 probability located in microbody (peroxisome); 0.1900 
probability located in plasma membrane 


SignalP analysis: 


Cleavage site between residues 20 and 21 
A search of the NOV14a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 14D. 



Table 14D. Geneseq Results for NOV14a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent*, Date] 


NOV14a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAE17010 


Human eosinophil serine 
protease-1 (esp-1) like 
enzyme #2 - Homo sapiens, 
314 aa. [WO200198503-A2, 
27-DEC-2001] 


1..262 
1..314 


262/314(83%) 
262/314(83%) 


e-154 


AAB80256 


Human PRO303 protein - 
Homo sapiens, 314 aa. 
[WO200104311-A1, 
18-JAN-2001] 


1..262 
1..314 


262/314(83%) 
262/314(83%) 


e-154 


AAU01569 


Human secreted protein 
immunogenic epitope 
encoded by gene #9 - Homo 
sapiens, 315 aa. 
[WO200123547-A1, 
05-APR-2001] 


1..262 
1..314 


262/314 (83%) 
262/314(83%) 



e-154 


AAU02223 


Human extracellular serine 
protease TADG-16 - Homo 
sapiens, 314 aa. 
[WO200127257-A1, 
19-APR-2001] 


1..262 
1..314 


262/314 (83%) 
262/314(83%) 


e-154 


AAY91871 


Human cancer-specific gene 
protein, Pro104 - Homo 
sapiens, 327 aa. 
[WO200016805-A1, 
30-MAR-2000] 


1..262 
14..327 


262/314(83%) 
262/314(83%) 


e-154 



In a BLAST search of public sequence databases, the NOV14a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 14E. 



Table 14E. Public BLASTP Results for NOV14a 
Protein 
Accession 
Number 


Protein/Organism/Length 


NOV14a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q9Y6M0 


Testisin precursor (EC 
3.4.21.-) (Eosinophil serine 
protease 1) (ESP-1) - Homo 
sapiens (Human), 314 aa. 


1..262 
1..314 


262/314 (83%) 
262/314 (83%) 


e-154 


Q9JHJ7 


Testisin precursor (EC 
3.4.21.-) (Tryptase 4) - Mus 
musculus (Mouse), 324 aa. 


1..261 
1..323 


179/326 (54%) 
210/326 (63%) 


1e-98 


Q920S2 


Testis serine protease-1 - Mus 
musculus (Mouse), 322 aa. 


1..261 
1..321 


150/325 (46%) 
180/325 (55%) 


2e-69 


Q9D4I3 


4931440B09Rik protein - 
Mus musculus (Mouse), 282 
aa. 


32..261 
2..281 


135/283 (47%) 
161/283 (56%) 


1e-66 


Q9PVX7 


Epidermis specific serine 
protease - Xenopus laevis 
(African clawed frog), 389 aa. 


33..244 
17..277 


100/264 (37%) 
136/264 (50%) 


3e^5 



PFam analysis predicts that the NOV14a protein contains the domains shown in the 
Table 14F. 




Table 14F. Domain Analysis of NOV14a 


Pfam Domain 


NOV14a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


trypsin 


42..85 


24/51 (47%) 
36/51 (71%) 


2.3e-13 


trypsin 


119..229 


52/121 (43%) 
92/121 (76%) 


9e-43 



Example 15. 

The NOV15 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 15A. 



Table 15A. NOV15 Sequence Analysis 



SEQ ID NO: 77 | 716 bp 



NOV15a, 
CG144997-01 
DNA Sequence 

GCGGCTCTCGCGGGTTCGGGATGTTCTATGCCGTGAGGAGGGGCCGCAAGACCGGGGTCTTTCTGAC 
CTGGAATGAGTGCAGAGACACGTTTTCCTACATGGGAGACTTCGTC 

TGCTCCAGTAATGGGCGTAGAAGGCCGCGAGCAGGAATCGGCGTTTACTGGGGGCCGGGCCATCCTT 

TAAATGTAGGCATTAGACTTCCTGGGCGGCAGACAAACCAAAGAGCGGAAATTCATGCAGCCTGCAA 

AGCCATTGAACAAGCAAAGACTCAAAACATCAATAAACTGGTTCTGTATACAGACAGTATGTTTACG 

ATAAATGGTATAACTAACTGGGTTCAAGGTTGGAAGAAAAATGGGTGGAAGACAAGTGCAGGGAAAG 

AGGTGATCAACAAAGAGGACTTTGTGGCACTGGAGAGGCTTACCCAGGGGATC 

GCATGTTCCTGGTCATTCGGGATTTATAGGCAATGAAGAAGCTGACAGATTAGCCAGAGAAGGAGCT 

AAACAATCGGAAGACTGAGCCATGTGACTTTAGTCCTTGGGAGAACTTGAGCCAGCGGCTGTCTTGC 


TGCCTGTACTTACTGGTGTGGAAAATAGCCTGCAGGTAGGACCATT 




ORF Start: ATG at 10 | ORF Stop: TGA at 619 





SEQ ID NO: 78 | 203 aa | MW at 22889.0kD 


NOV15a, 
CG144997-01 
Protein Sequence 


MSWFLFLAHRVALAALPCRRGSRGFGMFY^ 

NGRRRPRAGIGVYWGPGHPLNVGIRLPGRQTNQRAEIHAACKAIEQAKTQ 

ITNWVQGWKKNGWKTSAGKEVINKEDFVALERLTQGMDIQWMHVPGHSG 

ED 





SEQ ID NO: 79 | 631 bp 


NOV15b, 
278693648 DNA 
Sequence 


CACCGGATCCACCATGAGCTGGTTTCTGTTCCTGGCCCACAGAGTCGCCTTGGCCGCCTTGCCCTGC 
CGCCGCGGCTCTCGCGGGTTCGGGATGTTCTATGCCGTGAGGAGGGGCCGCAAGACCGGGGTCTTTC 
TGACCTGGAATGAGTGCAGAGACACGTTTTCCTACA^^ 
CTGCTGCTCCAGTAATGGGCGTAGAAGGCCGCGAGCAGGAATCGGCGTOTA 

CCTTTAAATGTAGGCATTAGACTTCCTGGGCGGCAGACAAACCAAAGAGCGGAAATTCATGCAGCCT 
GCAAAGCCATTGAACAAGCAAAGACTCAAAACATCAAT 

TACGATAAATGGTATAACTAACTGGGTTCAAGGTTGGAAGAAAAATGGGTGGAAGACAAGTGCAGGG 
AAAGAGGTGATCAACAAAGAGGACTTTGTGGCACTGGAGAGGCTTACCCAGGGGATGGACATTCAGT 
GGATGCATGTTCCTGGTCATTCGGGATTTATAGGCA^ 
AGCTAAACAATCGGAAGACCTCGAGGGC 




ORF Start: at 2 | ORF Stop: end of sequence 



SEQ ID NO: 80 | MW at 23534.6kD 


NOV15b, 
278693648 
Protein Sequence 


TGSTMSWFLFLAHRVALAALPCRRGSRGFGMFYAVRRGRKTG 

CCSSNGRRRPRAGIGVYWGPGHPLNVGIRL 

TINGITNWVQGWKKNGWKTSAGKEVINKEDF 

AKQSEDLEG 





SEQ ID NO: 81 | 586 bp 




NOV15c, 
278480974 DNA 
Sequence 


CACCGGATCCGCCTTGCCCTGCCGCCGCGGCTCTCGCGGGTTCGGGATGTTCTATGCCGTGAGGAGG 
GGCCGCAAGACCGGGGTCTTTCTGACCTGGAATGAGTGC^ 

TCGTCGTCGTCTACACTGATGGCTGCTGCTCCAGTAATGGGCGTAGAAGGCCGCGAGCAGGAATCGG 
CGTTTACTGGGGGCCGGGCCATCCTTTAAATGTAGGC^ 

AGAGCGGAAATTCATGCAGCCTGCAAAGCCATTGAACAAGCAAAGACTCAAAACATCAAT 

TTCTGTATACAGACAGTATGTTTACGATAAATGGTATAACTAACTGGGTTCAAGGTTGGA^ 

TGGGTGGAAGACAAGTGCAGGGAAAGAGGTGATCAACAA^ 
ACCCAGGGGATGGACATTCAGTGGATGCATGTTC 
CTGACAGATTAGCCAGAGAAGGAGCTAAACAATCGGAAGACCTCGAGGGC 




ORF Start: at 2 |ORF Stop: end of sequence 





SEQ ID NO: 82 | 195 aa | MW at 21789.5kD 


NOV15c, 
278480974 
Protein Sequence 


TGSALPCRRGSRGFGMFYAVRRGRKTGVFLTWNE 
VYWGPGHPLNVGIRLPGRQTNQRAEIHAACKA 

GWKTSAGKEVINKEDFVALERLTQGMDIQWMHVPGHSGFIGNEEADRLAREGAKQSEDLEG 





SEQ ID NO: 83 | 457 bp 


NOV15d, 
278498047 DNA 
Sequence 


CACCGGATCCGGAGACTTCGTCGTCGTCTACACTGATGGCTGCTGCTCCAGTAATGGGCGTAGAAGG 
CCGCGAGCAGGAATCGGCGTTTACTGGGGGCCGGGCCATCCTTTAAATGTAGGCATTAGACTTCCTG 
GGCGGCAGACAAACCAAAGAGCGGAAATTCATGCAGCCTGCAAAGCCATTGAACAAGCAAAGACTCA 
AAACATCAATAAACTGGTTCTGTATACAGACAGTATGTTTACGATAAATGGTATAACTAACTGGGTT 
CAAGGTTGGAAGAAAAATGGGTGGAAGACAAGTGCAGGGAAAGAGGTGATCAACAAAGAGGACTTTG 
TGGCACTGGAGAGGCTTACCCAGGGGATGGACATTCAGTGGATGCATGTTCCTGGTCATTCGGGATT 
TATAGGCAATGAAGAAGCTGACAGATTAGCCAGAGAAGGAGCTAAACTCGAGGGC 




ORF Start: at 2 | ORF Stop: end of sequence 





SEQ ID NO: 84 | 152 aa | MW at 16753.8kD 


NOV15d, 
278498047 
Protein Sequence 


TGSGDFVVVYTDGCCSSNGRRRPRAGIGVYWGPGHPLNVGIRLPGRQTNQRAEIHAACKAIEQAKTQ 

NINKLVLYTDSMFTINGITNWVQGWKKNGWKTSAGKEVINKEDFVALERLTQGMDIQWMHVPGHSGF 

IGNEEADRLAREGAKLEG 





SEQ ID NO: 85 


965 bp 




NOV15e, 
CG144997-02 
DNA Sequence 


GAGTGAGCGATGAGCTGGTTTCTGTTCCTGGCCCACAGAGTCGCCTTGGCCGCCTTGCCCTGCCGCC 

GCGGCTCTCGCGGGTTCGGGATGTTCTATGCCGTGAGGAGGGGCCGCAAGACCGGGGTCTTTCTGAC 

CTGGAATGAGTGCAGAGCACAGGTGGACCGGTTTCCTGCTGCCAGATTTAAGAAGTTTGCCACAGAG 

GATGAGGCCTGGGCCTTTGTCAGGAAATCTGGZ^GCCCGGAAGTTTCAGAAGGGCATGAAAATCAAC 

ATGGACAAGAATCGGAGGCGAAAGCCAGCAAGCGACTCCGTGAGCCACTGGATGGAGATGGACATGA 

AAGCGCAGAGCCGTATGCAAAGCACATGAAGCCGAGCGTGGAGCCGGCGCCTCCAGTTAGCAGAGAC 

ACGTTTTCCTACATGGGAGACTTCGTCGTCGTCTACACTGATGGCTGCTGCTCCAGTAATGGGCGTA 

GAAGGCCGCGAGCAGGAATCGGCGTTTACTGGGGGCCAGGCCATCCTTTAAATGTAGGCATTAGACT 

TCCTGGGCGGCAGACAAACCAAAGAGCGGAAATTCM^ 

ACTCAAAACATC^TAAACTGGTTCTC^ 

GGGTTCAAGGTTGGAAGAAAAATGGGTGGAAGACAAGTGCAGGGAAAGAGGTGATCAACAAAGAGGA 
CTTTGTGGCACTGGAGAGGCTTACCCAGGGGA 

GGATTTATAGGCAATGAAGAAGCTGAGAGATTAGCCAGAGAAGGAGCTAAACAATCGGAAGACTGAG 
CCATGTGACTTTAGTCCTTGGGAGAACTTGAGCCAGCGGCTGTCTTGCTGCCTGTACTTACTGGTGT 


GGAAAATAGCCTGCAGGTAGGACCATT 




ORF Start: ATG at 10 | ORF Stop: TGA at 868 
SEQ ID NO: 86 | 286 aa | MW at 32098.0kD 


NOV15e, 
CG144997-02 
Protein Sequence 


MSWFLFLAHRVALAALPCRRGSRGFGMFYAVRRGRKTGVT 
WAFVRKSASPEVSEGHENQHGQESEAKASKRLR^ 

YMGDFVVVYTDGCCSSNGRRRPRAGIGVYWGPGHPLNVGIRLPGRQTNQRAEIHAACKAIEQAKTQN 

INKLVLYTDSMFTINGITNOTQGWKKNGWKTSA 

GNEEADRLAREGAKQSED 



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 15B. 



Table 15B. Comparison of NOV15a against NOV15b through NOV15e. 


Protein Sequence 


NOV15a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV15b 


1..203 
5..207 


203/203 (100%) 
203/203 (100%) 


NOV15c 


14..203 
3..192 


189/190 (99%) 
190/190 (99%) 


NOV15d 


54..199 
4..149 


146/146 (100%) 
146/146 (100%) 


NOV15e 


47..203 
130..286 


157/157 (100%) 
157/157 (100%) 
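
Each row of Table 15B pairs a residue range in NOV15a with a residue range in the compared sequence. For an ungapped match the two ranges span the same number of residues, which equals the denominator of the identity ratio. The following sketch is illustrative only; it assumes ranges written as "start..end" with inclusive 1-based positions, and the function names are not from the source.

```python
def span_length(span: str) -> int:
    """Return the number of residues in an inclusive range such as '47..203'."""
    start_text, end_text = span.split("..")
    start, end = int(start_text), int(end_text)
    if end < start:
        raise ValueError("range end precedes range start")
    return end - start + 1


def residue_offset(query_span: str, match_span: str) -> int:
    """Return how far the match numbering is shifted relative to the query."""
    query_start = int(query_span.split("..")[0])
    match_start = int(match_span.split("..")[0])
    return match_start - query_start


# NOV15a 47..203 against NOV15e 130..286
print(span_length("47..203"), span_length("130..286"))
print(residue_offset("47..203", "130..286"))
```

For the NOV15e row, both ranges span 157 residues, matching the reported 157/157, and the NOV15e numbering is shifted by 83 positions relative to NOV15a.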



Further analysis of the NOV15a protein yielded the following properties shown in 
Table 15C. 



Table 15C. Protein Sequence Properties NOV15a 


PSort analysis: 


0.3700 probability located in outside; 0.1805 probability located in microbody 
(peroxisome); 0.1080 probability located in nucleus; 0.1000 probability 
located in endoplasmic reticulum (membrane) 


SignalP analysis: 


Cleavage site between residues 15 and 16 



A search of the NOV15a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publication, yielded 
several homologous proteins shown in Table 15D. 
Table 15D. Geneseq Results for NOV15a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV15a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 


AAY70235 


Human RNA-associated 
protein-16 (RNAAP-16) - 
Homo sapiens, 286 aa. 
[WO200011171-A2, 
02-MAR-2000] 


47..203 
130..286 


157/157 (100%) 
157/157 (100%) 


6e-92 


AAB97508 


Human type II RNase H 
protein - Homo sapiens, 286 
aa. [WO200123613-A1, 
05-APR-2001] 


47..203 
130..286 


156/157 (99%) 
157/157 (99%) 


1e-91 


AAY25094 


Human type 2 RNase H 
protein - Homo sapiens, 286 
aa. [WO9928447-A1, 
10-JUN-1999] 


47..203 
130..286 


156/157 (99%) 
157/157 (99%) 


1e-91 


ABB83371 


Human wild-type RNase H1 
- Homo sapiens, 286 aa. 
[WO200240635-A2, 
23-MAY-2002] 


47..203 
130..286 


156/157 (99%) 
156/157 (99%) 


2e-90 


ABB83374 


Mutant RNase H1, E186Q - 
Homo sapiens, 286 aa. 
[WO200240635-A2, 
23-MAY-2002] 


47..203 
130..286 


155/157 (98%) 
156/157 (98%) 


5e-90 



In a BLAST search of public sequence databases, the NOV15a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 15E. 



Table 15E. Public BLASTP Results for NOV15a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV15a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


O60930 


Ribonuclease H1 (EC 
3.1.26.4) (RNase H1) 
(Ribonuclease H type II) - 
Homo sapiens (Human), 286 
aa. 


47..203 
130..286 


157/157 (100%) 
157/157 (100%) 


2e-91 
Q8VCR6 


Ribonuclease H1 - Mus 
musculus (Mouse), 285 aa. 


47..203 
129..285 


150/157(95%) 




O70338 


Ribonuclease H1 (EC 
3.1.26.4) (RNase H1) - Mus 
musculus (Mouse), 285 aa. 


47..203 
129..285 


139/157 (88%) 
150/157 (95%) 


5e-83 


Q91953 


mRNA, complete cds, clone 
CLFEST65 - Gallus gallus 
(Chicken), 293 aa. 


50..202 
140..292 


117/153(76%) 
135/153 (87%) 


4e-70 


Q21024 


F59A6.6 protein - 
Caenorhabditis elegans, 369 
aa. 


58..199 
222..363 


65/142 (45%) 
93/142(64%) 


3e-32 



PFam analysis predicts that the NOV15a protein contains the domains shown in the 
Table 15F. 



Table 15F. Domain Analysis of NOV15a 


Pfam Domain 


NOV15a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


rnaseH 


54..199 


65/176 (37%) 
125/176(71%) 


2.8e-54 



Example 16. 

The NOV16 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 16A. 



Table 16A. NOV16 Sequence Analysis 




SEQ ID NO: 87 | 2274 bp 




NOV16a, 
CG145494-01 
DNA Sequence 


CCCCTAGTGACACTCAGGAAATGCTTGTCTCCGGCTGTTAAGGAATAATTTCAGAGTACTATGGATC 


ATGCTGAAGAAAATGAAATCCTTGCAGCAACCCAGAGGTACTATGTGGAAAGGCCTATCTTTAGTCA 
TCCGGTCCTCCAGGAAAGACTACACACAAAGGACAAGGTTCCTGATTCCATTGCGGATAAGCTGAAA 
(^GGCATTC^CATGTACTCCTAAAAAAATAAGAAATATCATTTATATGTTCCTACCCATAACTAAAT 
GGCTGCCAGCATACAAATTCAAGGAATATGTGTTGGKSTGACTT^ 

GCTTCAGCTTCCTCAAGGCTTAGCCTTTGCAATGCTGGCAGCTGTGCCTCCAATATTTGGCCTGTAC 
TCTTCATTTTACCCTGTTATCATGTATTGTTTTCTTGGAACCTCCAGACAC^TATCC^TAGGTCCTT 
TTGCTGTTATTAGCCTGATGATTGGTGGTGTAGCTGTTCGATTAGTACCAGATGATATAGTCATTCC 
AGGAGGAGTAAATGCAACCAATGGCACAGAGGCCAGAGATGCCTTGAGAGTGAAAGTCGCCATGTCT 
GTGACCTTACTTTCAGGAATCATTCAGTTTTGrc 

ATCTCACAGAGCCTCTGGTCCGTGGGTTTACCACCGCAGCAGCTGTGCATGTCTTCACCTCCATGTT 
AAAATATCTGTTTGGAGTTAAAAC^UVAGCGGTACAGTGGAATCTTTTCCGTGGTGTATAGTAraGTT 
GCTGTGTTGCAGAATGTTAAAAACCTCAACGTGTGTTCCCTAGGCGTCGGGCTGATGGTTTTTGGTT 
TGCTGTTGGGTGGCAAGGAGTTTAATGAGAGATTTAAAGAGAAATTGCCGGCGCCTATTCCTTTAGA 
GTTCTTTGCGGTCGTAATGGGAACTGGCATTTCAGCTGGGTTTAACTTGAAAGAATCATACAATGTG 
GATGTCGTTGGAACACTTCCTCTAGGGCTGCTACCTCCAGCCAATCCGGACACCAGCCTCTTCCACC 
TTGTGTACGTAGATGCCATTGCCATAGCCATCGTTGGATTTTCAGTGACCATCTCCATGGCCAAGAC 
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CTTAGCAAATAAAGkTGGCTACCAGGTTGACGGC^ 

TCCATTGGCTCACTCTTCCAGACCTTTTCAATTTCATGCTCCTTGTCTCGAAGCCTTGTTCAGGAGG 
GAACCGGTGGGAAGAGACAGGCTGTGCTGTCGGCCATTGTGATTGTCAACCTGAAGGGAATGTTTAT 
GCAGTTCTC^GATCTCCCCTTOTTCTGGAGAACCAGCAAAATAGAGCTGACCATCTGGCTTACCACT 
TTTGTGTCCTCCTTGTTCCTGGGATTGGACTATGGTTTGATCACTGCTGTGATCATTGCTCTGCTGA 
CTGTGATTTACAGAACACAGAGTCCAAGCTACAAAGTCCTTGGAAAGCTTCCTGAAACTGATGTGTA 
TATTGATATAGACGCATATGAGGAGGTGAAAGAAATTCCTGGAATAAAAATATTTCAAATAAATGCA 
CCAATTTACTATGCAAATAGCGACTTGTATAGCAATGCATTAAAACGAAAGACTGGAGTGAACCCAG 
CAGTCATCATGGGAGCAAGGAGAAAGGCCATGCGGAAGTACGCTAAGGAAGTCGGAAATGCAAATAT 
GGCCAACGCAACTGTTGTCAAAGCAGATGCAGAAGTAGATGGAGAGGATGCTACCAAGCCTGAAGAA 
GAGGATGGTGAAGTAAAATATCCCCCAATAGTGATCAAAAGCACATTTCCTGAGGAAATGCAAAGAT 
TTATGCCCCCAGGGGATAACGTCCACACTGTCATTTTGGATTTCACTCAAGTCAATTTTATTGATTC 
TGTTGGAGTGAAAACTCTGGCAGGGATTGTAAAAGAATATGGAGACGTCGGTATATATGTATACTTA 
GCAGGATGCAGTGCACAAGTTGTGAATGACCTCACTCGGAATAGATTTTTTGAAAATCCTGCCCTAT 
GGGAGCTGCTGTTCCACAGCATTCATGATGCAGTTTTAGGCAGCCAACTTAGAGAGGCACTTGCTGA 
ACAGGAAGCCTCGGCTCCCCCTTCCCAGGAGGACTTGGAGCCCAATGCCACTCCTGCCACTCCTGAG 
GCATAGATGAGGACCTCACCCTAGGATGGGGTTATAAGCCTCTCATGAAGTTCATAATTTACA 




ORF Start: ATG at 61 | ORF Stop: TAG at 2215 





SEQ ID NO: 88 | 718 aa | MW at 78546.4kD 


NOV16a, 
CG145494-01 
Protein Sequence 


MDHAEENEILAATQRYYVERPIFSHPVLQERLHTKDKVPDSIADKLKQAFTCTPKKIRNIIYMFLPI 
TKWLPAVKFKKSfVLGDLVSGISTGVLQLPQGLAFA^ 

GPFAVISLMIGGVAVRLVPDDIVIPGGVNATNGTEARDALRVKVAMSVTLLSGIIQFCLGVCRFGFV 

AIYLTEPLTOGFTTAAAVHWTSMLKYLFG 

FGLLLGGKEFNERFKKKLPAPIPLEFFAVVMG^^ 

FHLVYVDAIAI AIVGFS VTI SMAKTLANKHGYQVDGNQELI ALGLC^SI GSLFQTFS I SC SLSRSLV 

QEGTGGKTQAVLSAIVIVNLKGMFMQFSDLPFFWRTSKIELTIWLTTFVSSLFLGLDYGLITAVIIA 

LLTVIYRTQSPSYKVLGKLPETDVYIDIDAYEEVI^ 

NPAVIMGARRKAMRKYAKEVGNANMANATVVKADAEVTO 

QRFMPPGDNVHOTILDFTQVNFIDSVGVKTL^ 

ALWELLFHSIHDAVLGSQLREALAEQEASAPPSQEDLEPNATPATPEA 



Further analysis of the NOV16a protein yielded the following properties shown in 
Table 16B. 



Table 16B. Protein Sequence Properties NOV16a 


PSort analysis: 


0.6000 probability located in plasma membrane; 0.4000 probability located in 
Golgi body; 0.3200 probability located in nucleus; 0.3000 probability located 
in endoplasmic reticulum (membrane) 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV16a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publication, yielded 
several homologous proteins shown in Table 16C. 



Table 16C. Geneseq Results for NOV16a 
Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV16a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAY71067 


Human membrane transport 
protein, MTRP-12 - Homo 
sapiens, 758 aa. 
[WO200026245-A2, 
11-MAY-2000] 


9..684 
15..738 


291/741 (39%) 
433/741 (58%) 


e-148 


AAG67162 


Amino acid sequence of a 
human 32613 transporter 
polypeptide - Homo sapiens, 
751 aa. [WO200164875-A2, 
07-SEP-2001] 


9..684 
15..731 


289/734 (39%) 
432/734(58%) 


e-147 


ABG61914 


Prostate cancer-associated 
protein #115 - Mammalia, 
790 aa. [WO200230268-A2, 
18-APR-2002] 


16..699 
20..741 


268/723 (37%) 
419/723 (57%) 


e-144 


AAM51696 


Human pendrin SEQ ID NO 
2 - Homo sapiens, 780 aa. 
[JP2001228146-A, 
24-AUG-2001] 


16..699 
20..741 


268/723 (37%) 
419/723 (57%) 


e-144 


AAM51695 


Mouse pendrin SEQ ID NO 1 
- Mus sp, 780 aa. 
[JP2001228146-A, 
24-AUG-2001] 


16..688 
20..730 


270/713(37%) 
414/713 (57%) 


e-142 



In a BLAST search of public sequence databases, the NOV16a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 16D. 



Table 16D. Public BLASTP Results for NOV16a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV16a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


P58743 


Prestin - Homo sapiens 
(Human), 744 aa. 


1..718 
1..744 


718/744(96%) 
718/744(96%) 


0.0 


Q9JKQ2 


Prestin - Meriones 
unguiculatus (Mongolian 
jird) (Mongolian gerbil), 744 
aa. 


1..718 
1..744 


679/744(91%) 
699/744(93%) 


0.0 


Q99NH7 



Prestin - Mus musculus 
(Mouse), 744 aa. 


1..718 
1..744 


680/744(91%) 
700/744 (93%) 


0.0 
Q9EPH0 


Prestin - Rattus norvegicus 
(Rat), 744 aa. 


1..718 
1..744 


699/744 (92%) 




AAH28856 


Solute carrier family 26, 
member 6 - Mus musculus 
(Mouse), 735 aa. 


16..684 
8..715 


282/718 (39%) 
432/718 (59%) 


e-148 
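
Each Identities/Similarities cell in these tables pairs a count ratio with a whole-number percentage, for example `291/741 (39%)`. The following is a minimal sketch, not part of the original analysis, for parsing such a cell and recomputing the exact percentage. The function names are hypothetical. Because the printed figures may be truncated or rounded depending on the tool that produced them, the sketch reports the exact value rather than asserting a rounding rule.

```python
import re

# Matches cells such as "291/741 (39%)" or "718/744(96%)".
_CELL = re.compile(r"\s*(\d+)\s*/\s*(\d+)\s*\(\s*(\d+)\s*%\s*\)\s*")


def parse_match(cell: str) -> tuple[int, int, int]:
    """Return (matches, aligned_length, printed_percent) from a table cell."""
    m = _CELL.fullmatch(cell)
    if m is None:
        raise ValueError(f"unrecognized cell: {cell!r}")
    return int(m.group(1)), int(m.group(2)), int(m.group(3))


def exact_percent(matches: int, aligned_length: int) -> float:
    """Exact percentage of matching positions over the aligned length."""
    return 100.0 * matches / aligned_length


if __name__ == "__main__":
    matches, length, printed = parse_match("291/741 (39%)")
    print(f"printed {printed}%, exact {exact_percent(matches, length):.2f}%")
```

A reader can use this to spot cells where the printed percentage and the ratio disagree, which in an extracted document may indicate a scanning error.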



PFam analysis predicts that the NOV16a protein contains the domains shown in the 
Table 16E. 



Table 16E. Domain Analysis of NOV16a 


Pfam Domain 


NOV16a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


COX3 


334..458 


31/266(12%) 
80/266 (30%) 


0.7 


Sulfate_transp 


193..477 


111/328(34%) 
234/328 (71%) 


7e-78 


STAS 


500..683 


34/188 (18%) 
124/188 (66%) 


1.4e-12 
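
The Expect Value column mixes several notations: plain decimals (`0.7`), scientific notation (`7e-78`), and the BLAST shorthand `e-148`, which stands for 1e-148. A minimal sketch for normalizing these strings to floats is shown below; the function name is hypothetical and the sketch assumes the cell has already been cleaned of scanning errors.

```python
def parse_expect(text: str) -> float:
    """Convert an Expect Value cell to a float.

    A bare exponent such as "e-148" is read as 1e-148.
    """
    s = text.strip().lower()
    if s.startswith("e"):
        s = "1" + s
    return float(s)


if __name__ == "__main__":
    # Smaller expect values indicate more significant matches.
    cells = ["0.7", "7e-78", "1.4e-12", "e-148"]
    for cell in sorted(cells, key=parse_expect):
        print(cell, parse_expect(cell))
```

Sorting by the parsed value orders hits from most to least significant.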



Example 17. 

The NOV17 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 17A. 
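
The relationship between a nucleotide sequence and its encoded polypeptide can be illustrated with a short translation sketch using the standard genetic code. This is an illustrative example only, not the method used to generate the tables. The function name is hypothetical, and the example input is the first 45 bases of the NOV17a open reading frame as printed in Table 17A, with the initial A of the ATG start codon assumed because the extracted text breaks the line at that point.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, stopping at the first stop codon."""
    dna = "".join(dna.split()).upper()  # drop whitespace left by extraction
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    orf_start = "ATGGATGACAAAGATATTGACAAAGAACTAAGGCAGAAATTAAAC"
    print(translate(orf_start))
```

The output can be compared with the first residues of the NOV17a protein sequence listed in the same table.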



Table 17A. NOV17 Sequence Analysis 




SEQ ID NO: 89 | 2124 bp 


NOV17a, 
CG145722-01 
DNA Sequence 


AAGCTGAGGTC TTATAGATTGGTGGTACTTAAGGCAGAAAATTAACACCGTGTTTTGTAGC TGTTAG 


TTGGTAGAGGGAAATTCAGGCTACCGTCGCGAAACCTGCAGGTTAAGTTATTTTCTCCTCCCTGCTT 


CTGTAGGTTCACAGCGTTCCCTTCTGATAGAGCTTTTTC 


TGGATGACAAAGATATTGACAAAGAACTAAGGCAGAAATTAAACTTTTCCTATTGTGAGGAGACTGA 
GATTGAAGGGCAGAAGAAAGTAGAAGAAAGCAGGGAGGCTTCGAGCCAAACCCCAGAGAAGGGTGAA 
GTGCAGGATTCAGAGGCAAAGGGTACACCACCTTGGACTCCCCTTAGCAACGTGCATGAGCTCGACA 
CATCTTCGGAAAAAGACAAAGAAAGTCCAGATCAGATTTTGAGGACTCCAGTGTCACACCCTCTCAA 
ATGTCCTGAGACACCAGCCCAACCAGACAGCAGGAGCAAGCTGCTGCCCAGTGACAGCCCCTCTACT 
CCCAAAACCATGCTGAGCCGGTTGGTGATTTCTCCAAC^GGGAAGCTTCCTTCCAGAGGCCCTAAGC 
ATTTGAAGCTCACACCTGCTCCCCTCAAGGATGAGATGACCTCATTGGCTCTGGTCAATATTAATCC 
CTTCACTCCAGAGTCCTATAAAAAATTATTTCTTCAATCTGGTGGCAAGAGGAAAATAAGAAGATGT 
GTTTTACGAGAAACCAACATGGCTTCCCGCTATGAAAAAGAATTCTTGGAGGTTGAAAAAATTGGGG 
TTGGCGAATTTGGTACAGTCTACAAGTGCATTAAGAGGCTGGATGGATGTGTTTATGCAATAAAGCG 
CTCTATGAAAACTTTTACAGAATTATCAAATGAGAATTCGGCTTTGCATGAAGTTTATGCTCACGCA 
GTGCTTGGGCATCACCCCCATGTGGTACGTTACTATTCCTCATGGGCAGAAGATGACCACATGATCA 
TTCAGAATGAATACTGCAATGGTGGGAGTTTGCAAGCTGCTATATCTGAAAACACTAAGTCTGGCAA 
TCATTTTGAAGAGCCAAAACTCAAGGACATCCTTCTACAGATTTCCCTTGGCCTTAATTACATCCAC 
AACTCTAGCATGGTACACCTGGACATCAAACCTAGTAATATATTCATTTGTCACAAGATGCAAAGTG 
AATCCTCTGGAGTCATAGAAGAAGTTGAAAATGAAGCTGATTGGTTTCTCTCTGCCAATGTGATGTA 
TAAAATTGGTGACCTGGGCCACGCAACATCAATAAAC 
TTCCTCGCTAM 

GATTAACAATTGCAGTGGCTGCAGGAGCAGAGTCATTGCCCACCAATGGTGCTGCATGGCACCATAT 
CCGCAAGGGTAACTTTCCGGACGTTCCTCAGGAGCTCTCAGAAAGCTTTTCCAGTCTGCTCAAGAAC 
ATGATC CAAC CTGATGCCGAACAGAGACC TTC TGC AGCAGCTCTGGCCAGAAATACAGTTC TCCGGC 
CTTCCCTGGGAAAAACAGAAGAGCTCCAACAGCAGCTGAATTTGGAAAAGTTCAAGACTGCCACACT 
GGAAAGGGAACTGAGAGAAGCCCAGCAGGCCCAGTCACCCCAGGGATATACCCATCATGGTGACACT 
GGGGTCTCTGGGACCGACACAGGATCAAGAAGCACAAAACGCCTGGTGGGAGGAAAGAGTGGAAGGT 
CTTCAAGCTTTACCTGTGAGTA ATCTTCCCCTTAAGAACTCATTTTGCAGCCGGGCGTGGTGGCTCA 
CGCCTGTAATCCCAACACTTTGGGAGGCCAAGGCAGGTGGATCATGAGGTCAGGAGATCGAAACCAT 
CCTGGCTAACACGGTGAAACCCCATCTCTACTAAAAATACAAAAAATTAGCAGGGCGAGGTGGCAGG 
CGCCTATAATCCCAGC TACTCAGGAGGCTG AGG AAGG AGAATCGC TTGAACCCGGGAGGTGGAGCTT 
GCAGTGAGCTGAGATCACAC^^ 

ORF Start: ATG at 201 | ORF Stop: TAA at 1830 





SEQ ID NO: 90 | 543 aa | MW at 60514.5kD 


NOV17a, 
CG145722-01 
Protein Sequence 


MDDKDIDKELRQKLNFSYCEETEIEGQKK^ 

TSSEKDKESPDQILRTPVSHPLKCPETPAQPDSRSKLLPSDSPSTPKTMLSKLVISPTGKLPSRGPK 

HLKLTPAPLKDEOTSIiALVNINPFTPESYKKLFLQ 

VGEFGTVYKCIKRLDGCVYAIKRSMKTFTELSN^ 

IQNEYCNGGSLQAAI SENTKSGNHFEEPKLKDILLQI SLGLNYIHNSSMVHLDIKPSNIFICHKMQS 

ESSGVTEEVENEADWFLSANVMYKIGDLGHATSIKKPKVEEGDSI^ 

GLTIAVAAGAESLPTNGAAWHHIRKGNFPDVPQELSESFSSIJ^^^ 

PSLGKTEEIiQQQLNLEKFKTATLERELREAQQAQSPQGYTHHGDTGVSGTHTGSRSTKRLV 
SSSFTCE 
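
The ORF coordinates and protein lengths reported in these sequence tables follow a simple arithmetic relationship: with the start position taken as the first base of the start codon and the stop position as the first base of the stop codon, the stop position equals the start position plus three times the number of encoded residues. The sketch below checks this for several entries in this section; the function name is hypothetical and the figures are copied from the tables.

```python
def expected_stop(orf_start: int, protein_length: int) -> int:
    """Position of the first base of the stop codon (1-based)."""
    return orf_start + 3 * protein_length


# (name, ORF start, ORF stop, protein length in aa) as listed in the tables.
ENTRIES = [
    ("NOV17a", 201, 1830, 543),
    ("NOV18a", 1, 751, 250),
    ("NOV18b", 54, 810, 252),
    ("NOV18c", 16, 610, 198),
    ("NOV19a", 61, 1690, 543),
]

if __name__ == "__main__":
    for name, start, stop, length in ENTRIES:
        status = "ok" if expected_stop(start, length) == stop else "MISMATCH"
        print(f"{name}: start {start}, {length} aa, stop {stop}: {status}")
```

The same check is useful for recovering a coordinate that is illegible in a scanned table, provided the other two values are readable.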



Further analysis of the NOV17a protein yielded the following properties shown in 
Table 17B. 



Table 17B. Protein Sequence Properties NOV17a 


PSort analysis: 


0.4500 probability located in cytoplasm; 0.3000 probability located in 
microbody (peroxisome); 0.1000 probability located in mitochondrial matrix 
space; 0.1000 probability located in lysosome (lumen) 


SignalP analysis: 


No Known Signal Sequence Predicted 
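
The PSort result is reported as a semicolon-separated list of probabilities and locations. A minimal sketch for turning that text into ranked (probability, location) pairs is given below. The function name is hypothetical, and the input string is the NOV17a PSort entry from Table 17B.

```python
import re

_ENTRY = re.compile(r"(\d*\.\d+)\s+probability located in\s+([^;]+)")


def parse_psort(text: str) -> list[tuple[float, str]]:
    """Return (probability, location) pairs, highest probability first."""
    flat = " ".join(text.split())  # collapse line breaks from the table cell
    pairs = [(float(p), loc.strip()) for p, loc in _ENTRY.findall(flat)]
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)


NOV17A_PSORT = """
0.4500 probability located in cytoplasm; 0.3000 probability located in
microbody (peroxisome); 0.1000 probability located in mitochondrial matrix
space; 0.1000 probability located in lysosome (lumen)
"""

if __name__ == "__main__":
    for probability, location in parse_psort(NOV17A_PSORT):
        print(f"{probability:.4f}  {location}")
```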



A search of the NOV17a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 17C. 



Table 17C. Geneseq Results for NOV17a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV17a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 
AAB62519 


Xenopus Wee1 protein 
catalytic domain (residues 
210-443) - Xenopus sp, 240 
aa. [US6225101-B1, 
01-MAY-2001] 


188..431 
1..240 


191/244 (77%) 




AAY51401 


Xenopus sp. Wee1 catalytic 
domain protein fragment - 
Xenopus sp, 240 aa. 
[US6020194-A, 
01-FEB-2000] 


188..431 
1..240 


170/244 (69%) 
191/244 (77%) 


1e-94 


ABB60693 


Drosophila melanogaster 
polypeptide SEQ ID NO 
8871 - Drosophila 
melanogaster, 609 aa. 
[WO200171042-A2, 
27-SEP-2001] 


109..501 
101..551 


180/464 (38%) 
257/464 (54%) 


9e-78 


AAY96776 


Z. mays partial wee1 kinase - 
Zea mays, 525 aa. 
[WO200037645-A2, 
29-JUN-2000] 


185..465 
264..513 


103/282 (36%) 
153/282 (53%) 


3e-45 


AAY96770 


Z. mays partial wee1 kinase - 
Zea mays, 403 aa. 
[WO200037645-A2, 
29-JUN-2000] 


185..465 
142..391 


103/282 (36%) 
153/282 (53%) 


3e-45 



In a BLAST search of public sequence databases, the NOV17a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 17D. 



Table 17D. Public BLASTP Results for NOV17a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV17a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


O95017 


WUGSC:H_DJ0894A10.2 
protein - Homo sapiens 
(Human), 541 aa (fragment). 


1..541 
1..541 


541/541 (100%) 
541/541 (100%) 


0.0 


P47817 


Wee1-like protein kinase (EC 
2.7.1.112) - Xenopus laevis 
(African clawed frog), 555 
aa. 


10..542 
11..552 


291/560(51%) 
352/560 (61%) 


e-143 


O57473 


Wee1 homolog - Xenopus 
laevis (African clawed frog), 
554 aa. 


10..542 
11..551 


294/566 (51%) 
357/566(62%) 


e-143 






Q8QGV2 


Wee1B kinase - Xenopus 
laevis (African clawed frog), 
595 aa. 




10..541 
20..593 


263/579 (43%) 
350/579 (60%) 


; ""3 dL 
eT2T 


Q63802 


Wee1-like protein kinase (EC 
2.7.1.112) - Rattus 
norvegicus (Rat), 646 aa. 


92..541 
168..644 


236/484 (48%) 
308/484 (62%) 


e-118 



PFam analysis predicts that the NOV17a protein contains the domains shown in the 
Table 17E. 




Table 17E. Domain Analysis of NOV17a 


Pfam Domain 


NOV17a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


pkinase 


194..462 


73/310(24%) 
193/310(62%) 


6.4e-45 
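
The match regions in these tables are written as inclusive residue ranges such as `194..462`. A minimal sketch for computing the span of a range and the fraction of the protein it covers follows. The function names are hypothetical; the example uses the pkinase match region from Table 17E and the 543 aa length reported for NOV17a.

```python
def parse_range(cell: str) -> tuple[int, int]:
    """Parse an inclusive residue range written as 'first..last'."""
    first, last = cell.strip().split("..")
    return int(first), int(last)


def span(cell: str) -> int:
    """Number of residues in an inclusive range."""
    first, last = parse_range(cell)
    return last - first + 1


def coverage(cell: str, protein_length: int) -> float:
    """Fraction of the full-length protein covered by the range."""
    return span(cell) / protein_length


if __name__ == "__main__":
    region = "194..462"
    print(span(region), f"{coverage(region, 543):.3f}")
```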



Example 18. 

The NOV18 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 18A. 



Table 18A. NOV18 Sequence Analysis 




SEQ ID NO: 91 | 753 bp 


NOV18a, 
CG145754-01 
DNA Sequence 


TCCCTTCTCCTGCCCCTGCAGATCCTACTGCTATCCOTAGCCTTGGAAACTGCAGGAGAAGAAGCCC 
AGGGTGACAAGATTATTGATGGCGCCCCATGTGCAAGAGGCTCCCACCCATGGCAGGTGGCCCTGCT 
CAGTGGCAATCAGCTCCACTGCGGAGGCGTCCTGGTCAATGAGCGCTGGGTC 

TGCAAGATGAATGAGTACACCGTGCACCTGGGCAGTGATACGCTGGGCGACAGGAGAGCTCAGAGGA 
TC^GGCCTCG^GTCATTCCGCCACCCCGGCTACTCCACACAGACCCATGTTAATGACCTCATGCT 
CGTGAAGCTCAATAGCCAGGCCAGGCTGTCATCCATGGTGAAGAAAGTCAGGCTGCCCT.CCCGCTGC 
GAACCCCCTGGAACCACCTGTACTGTCTCCGGCTGGGGCACTACCACGAGCCCAGATGTGACCTTTC 
CCTCTGACCTCATGTGCGTGGATGTCAAGCTCATC^ 

CTTACTGGAAAATTCCATGCTGTGCGCTGGCATCCCCGACTCCAAGAAAAACGCCTGCAATGGTGAC 
TCAGGGGGACCGTTGGTGTGCAGAGGTACCCTGCZAAGGTCTGGTGTCCTGGGGAACTTTCCCTTGCG 
GCCAACCCAATGACCGAGGAGTCTACACTCAAGTGTGCAAGTTCACCAAGTGGATAA^ 
GAAAAAGCATCGCTAA 




ORF Start: at 1 | ORF Stop: TAA at 751 








SEQ ID NO: 92 | 250 aa | MW at 27166.0kD 


NOV18a, 
CG145754-01 
Protein Sequence 


SLLLPLQILLLSLALETAGEEAQGDKIIDGAPCARGSHPWQVALLSGNQLHCGGVLVNERWVLTAAH 

CKMNEYTVHLGSOTI^DRRAQRIKASKSFRHPGYSTQTHViroLMLVKLKSQA^ 

EPPGTTCOTSGWGTTTSPDVTFPSDLMCVDVKLISPODCTKVYKDLLENSMI.CAGIPDSKKNACNGD 
SEQ ID NO: 93 | 862 bp 


NOV18b, 
CG145754-03 
DNA Sequence 


ACTGGGTCCGAATCAGTAGGTGACCCCGCCCCTGGATTCTGGAAGACCTCACCATGGGACncrrr'rn 


ACCTCGTGCGGCCAAGACGTGGATGTTCCTGCTCTTACTGGGGGGAGCC0X3GGCAGCCAGGGGTGAC 
AAGATTATTGATGGCGCCCCATGTGCAAGAGGCTCCCACCCATGGCAGGTGGCCCTGCTCAGTGGCA 
ATCAGCTCCACTGCGGAGGCGTCCTGGTCAATGAGCGCTGGGTGCTCACTGCCGCCCACTGCAAGAT 
GAATGAGTACACCGTGCACCTGGGCAGTGATACGC TGGGCGACAGGAGAGC TCAG AGG ATCAAGGCC 
TCGAAGTCATTCCGCCACCCCGGCTACTCCACACAGACCCATGTTAATGACCTCATGCTCGTGAAGC 
TCAATAGCCAGGCCAGGCTGTCATCCATGGTGAAGAAAGTCAGGCTGCCCTCCCGCTGCGAACCCCC 
TGGAACCACCTGTACTGTCTCCGGCTGGGGCACTACCACGAGCCCAGATGTGACCTTTCCCTCTGAC 
CTCATGTGCGTGGATGTCAAGCTCATCTCCCCCCAGGACTGCACGAAGGTTTACAAGGACTTACTGG 
AAAATTCCATGCTGTGCGCTGGCATCCCCGACTCCAAGAAAAACGCCTGCAATGGTGACTCAGGGGG 
ACCGTTGGTGTGCAGAGGTACCCTGCAAGGTCTGGTGTCCTGGGGAACTTTCCCTTGCGGCCAACCC 
AATGACCCAGGAGTCTACACTCAAGTGTGCAAGTTCACCAAGTGGATAAATGACACCATGAAAAAGC 
ATCGCTAACGCCACACTGAGTTAATTAACTGTGTGCTTCCAACAGAAAATGCACAGGA 




ORF Start: ATG at 54 | ORF Stop: TAA at 810 





SEQ ID NO: 94 | 252 aa | MW at 27557.6kD 


NOV18b, 
CG145754-03 
Protein Sequence 


MGRPRPRAAKTWMFLLLLGGAWAARGDKIIDGAPCARGSHPWQ 

AHCKMNEYTVHLGSDTLGDRRAQRIKASKSF^ 

RCEPPGTTCWSGWGTTTSPDOTFPSDLMCVDV^ 

GDSGGPLVCRGTLQGLVSWGTFPCGQPOTPGVYTQVCKFTKWIl^TMKKHR 





SEQ ID NO: 95 | 804 bp 


NOV18c, 
CG145754-02 
DNA Sequence 


GGATTTCCGGGCTCCATGGCAAGATCCCTTCT 

TGGAAACTGCAGG AGAAGAAGCC C AGGGTGAC AAGATTAT TGATGGCGCCCCATGTGC AAGAGGCTC 
CCACCCATGGCAGGTGGCCCTGCTCAGTGGCAATCAGCTCCACTGCGGAGGCGTCCTGGTCAATGAG 
CGCTGGGTGCTCACTGCCGCCCACTGCAAGATGAATGAGTACACCGTGCACCTGGGCAGTGATACGC 
TGGGCGACAGGAGAGCTCAGAGGATCAAGGCCTCGAAGTCATTCCGCCACCCCGGCTACTCCACACA 
GACCXrATGTTAATGACCTCAAGCTCATCTCCCCCCAGGACTGCACGAAGGTTTACAAGGACTTACTG 
GAAAATTCCATGCTGTGCGCTGGCATCCCCGACTCCAAGAAAAACGCCTGCAATGGTGACTCAGGGG 
GACCGTTGGTGTGCAGAGGTACCCTGCAAGGTCTGGTGTCCTGGGGAACTTTCCCTTGCGGCCAACC 
CAATGACCCAGGAGTCTACACTCAAGTGTGCAAGTTCACCAAGTGGATAAATGACACCATGAAAAAG 
CATCGCTAACGCCACACTGAGTTAATTAACTGTGTGCTTCCAACAGAAAATGCACAGGAGTGAGGAC 


GCCGATGACCTATGAAGTCAAATTTGACTTTACCTTTCCTCAAAGATATATTTAAACCTCATGCCCT 


GTTGATAAACCAATCAAATTGGTAAAGACCTAAAACCAAAACAAATAAAGAAACACAAAACCCTCAA 




ORF Start: ATG at 16 | ORF Stop: TAA at 610 





SEQ ID NO: 96 | 198 aa | MW at 21613.6kD 


NOV18c, 
CG145754-02 
Protein Sequence 


MARSLLLPLQILLLSLALETAGEEAQGDKIIDGAPC^ 

AAHCKMNEYTVHLGSDTLGDRRAQRIKASKSFRHPGYSTQ 

CAGIPDSKKNAraGDSGGPLVCRGTLQGLVSWGTFPTO^^ 




Protein Sequence Jsggpiatc^^ IL 
SEQ ID NO: 97 | 544 bp 


NOV18d, 
252718128 DNA 
Sequence 


CACCGGATCCGAAGAAGCCCAGGGTGACAAGATTATTGATGGCGCCCCATGTGCAAGAGGCTCCCAC 

CCATGGCAGGTGGCCCTGCTCAGTGGCAATCAGCTCCACTGCGGAGGCGTCCTGGTCAATGAGCGCT 

GGGTGCTCACTGCCGCCCACTGCAAGATGAATGAGTACACCGTGCACCTGGGCAGTGATACGCTGGG 

CGACAGGAGAGCTCAGAGGATCAAGGCCTCGAAGTCATTCCGCCACCCCGGCTACTCCACACAGACC 

CATGTTAATGACCTCAAGC TC ATCTCCCCCCAGGACTGCACGAAGGTT TAC AAGGAC T TACTGGAAA 

ATTCCATGCTGTGCGCTGGCATCCCCGACTCC^UIGAAAAACGCCTGCAATGGTGACTCA 

GTTGGTGTGCAGAGGTACCCTGCAAGGTCTGGTGTCCTGGGGAACTTTCCCTTGCGGCCAACCCAAT 

GACCCAGGAGTCTACACTCAAGTGTGCAAGTTCACCAAGTGGATAAATGACACCATGAAAAAGCATC 

TCGAGGGC 




ORF Start: at 2 


ORF Stop: end of sequence 





SEQ ID NO: 98 | 181 aa | MW at 19683.2kD 


NOV18d, 
252718128 
Protein Sequence 


TGSEEAQGDKIIIX^PCARGSHPWQVALLSGNQLHCGGVLVNE 

DRRAQRI KASKSFRHPGYS TQTHVNDLKL I S PQDCTKVYKDLLENSMLCAG I PDSKKNACNGDSGG P 
LVC^GTLQGLVSWGTFPCGQPNDPGVYTQVC^TKWINiyimKHLEG 





SEQ ID NO: 99 | 292 bp 


NOV18e, 
252718152 DNA 
Sequence 


CACCGGATCCGAAGAAGCCCAGGGTGACAAGATTATTGATGGCGCCCCATGTGCAAGAGGCTCCCAC 
CCATGGCAGGTGGCCCTGCTCAGTGGCAATCAGCTCCACTGCGGAGGCGTCCTGGTCAATGAGCGCT 
GGGTGCTCACTGCCGCCCAC TGCAAGATGAATGAGTACACCGTGC ACC TGGGC AGTGATACGC TGGG 
CGACAGGAGAGCTCAGAGGATCAAGGCCTCGAAGTCATTCCGCCACCCCGGCTACTCCACACAGACC 
CATGTTAATGACCTCCTCGAGGGC 




ORF Start: at 2 | ORF Stop: end of sequence 





SEQ ID NO: 100 | 97 aa | MW at 10551.7kD 


NOV18e, 
252718152 
Protein Sequence 


TGSEEAQGDKIIDGAPCARGSHPWQVALLSGNQLHCGGVLTO 
DRRAQRIKASKSFRHPGYSTQTHVNDLLEG 





SEQ ID NO: 101 | 742 bp 


NOV18f, 
247856668 DNA 
Sequence 


AGGCTCCGCGGGCGCCCCCTTCACCGGATCCGCCAGGGGTGACAAGATTATTGATGGCGCCCCATGT 
GCAAGAGGCTCCCACCCATGGCAGGTGGCCCTGCTCAGTGGCAATCAGCTCCACTGCGGAGGCGTCC 
TGGTCAATGAGCGCTGGGTGCTCACTGCCGCCCACTGCAAGATGAATGAGTACACCGTGCACCTGGG 
CAGTGLATACGCTGGGCGACAGGAGAGCTCAGAGGATCAAGGCCTCGAAGTCATTCCGCCACCCCGGC 
TACTCCACACAGACCCATGTTAATGACCTCATGCTCGTGAAGCTCAATAGCCAGGCCAGGCTGTCAT 
CCATGGTGAAGAAAGTCAGGCTGCCCTCCCGCTGCGAACCCCCTGGAACCACCTGTACTGTCTCCGG 
CTGGGGCACTACCACGAGCCCAGATGTGACCTTTCCCTCTGACCTCATGTGCGTGGATGTCAAGCTC 
ATCTCCCCCCAGGACTGCACGAAGGT^ 

TCCCCGACTCCAAGAAAAACGCCTGCAATGGTGACTCAGGGGGACCGTTGGTGTGCAGAGGTACCCT 
GCAAGGTCTGGTGTCCTGGGGAACTTTCCCTTGCGGCCAACCCAATGACCCAGGAGTCTACACTCAA 
GTGTGCAAGTTCACCAAGTGGATAAATGACACCATGAAAAAGCATCGCCTCGAGGGCAAGGGTGGGC 
GCGCC 

ORF Start: at 2 | ORF Stop: end of sequence 





SEQ ID NO: 102 | 247 aa | MW at 26591.2kD 


NOV18f, 
247856668 
Protein Sequence 


G S AAAP F TG S ARGDK I IDG AP C ARG SH PWQ VALL SGNQLH CGGVL VNERWVLTAMIC KMNE YTVHLG 
SDTLGDRRAQRIKASKSFRHPGYSTQTHVOTL^ 

WGTTTS PDVTFPSDLMCVDVKL I S PQDCTKVYKDLLENSMLCAGI PDSKKNACNGDS GGPLVCRGTL 
QGL VS WGTF PCGQP1TOPGVYTQVCKFTKW INDTMKKHRLEGKGGRA 





SEQ ID NO: 103 | 673 bp 




NOV18g, 
247856705 DNA 
Sequence 


AGGCTCCGCGGCCGCCCCCTTCACCGGATCCGCCAGGGGTGACAAGATTATTGATGGCGCCCCATGT 

GCAAGAGGCTCCCACCCATGGCAGGTGGCCCTGCTCAGTGGCAATCAGCTCCACTGCGGAGGCGTCC 

TGGTCAATGAGCGCTGGGTGCTCACTGCCGCCCACTGCAAGATGAATGAGTACACCGTGCACCTGGG 

CAGTGATACGCTGGGCGACAGGAGAGCTCAGAGGATCAAGGCCTCGAAGTCATTCCGCCACCCCGGC 

TACTCCACACAGACCCATGTTAATGACCTCATGCTCGTGAAGCTCAATAGCCAGGCCAGGCTGTCAT 

CCATGGTGAAGAAAGTCAGGCTGCCCTCCCGCTGCGAACCCCCTGGAACCACCTGTACTGTCTCCGG 

CTGGGGCACTACCACGAGCCCAGATGTGACCTTTCCCTCTGACCTCATGTGCGTGGATGTCAAGCTC 

ATCTCCCCCCAGGACTGCACGAAGGTTTACAAGGACTTACTGGAAAATTCCATGCTGTGCGCTGGCA 

TCCCCGACTCCAAGAAAAACGCCTGCAATX^TGACTCAGGGGGACCGTTGGTGTGCAGAGGTACCC 

GCAAGGTCTGGTGTCCTGGGGAACTTTCCCTTGCGGCCAACCCAATCTCGAGGGCAAGGGTGGGCGC 

GCC 




ORF Start: at 2 | ORF Stop: end of sequence 





SEQ ID NO: 104 | 224 aa | MW at 23813.0kD 


NOV18g, 
247856705 
Protein Sequence 


GSAAAPFTGSARGDKIIDGAPCARGSHPWQVALLSGNQLHra 
SDTLGDRRAQRIKASKSFRHPGYSTQTHVNDLM^ 
WGTTTSPDVTFPSDLMCVDVKLISPQKTKV^ 
QGLVSWGTFPCGQPNLEGKGGRA 



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 18B. 



Table 18B. Comparison of NOV18a against NOV18b through NOV18g. 


Protein Sequence 


NOV18a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV18b 


25..250 
27..252 


213/226(94%) 
213/226 (94%) 
NOV18c 


16..250 






19..198 


177/235 (74%) 


NOV18d 


17..249 


172/233 (73%) 




1..178 


173/233 (73%) 


NOV18e 


17..111 


92/95 (96%) 




1..95 


yj/yD \yj70) 


NOV18f 


22..250 


215/229 (93%) 




11..239 


216/229 (93%) 


NOV18g 


22..230 


193/209 (92%) 




11..219 


194/209(92%) 



Further analysis of the NOV18a protein yielded the following properties shown in 
Table 18C. 



Table 18C. Protein Sequence Properties NOV18a 


PSort analysis: 


0.6233 probability located in outside; 0.1000 probability located in 
endoplasmic reticulum (membrane); 0.1000 probability located in 
endoplasmic reticulum (lumen); 0.1000 probability located in microbody 
(peroxisome) 


SignalP analysis: 


Cleavage site between residues 20 and 21 
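
A predicted cleavage site between residues 20 and 21 implies that residues 1 to 20 form the signal peptide and the mature chain begins at residue 21. The sketch below illustrates the split with 1-based residue numbering. The function name is hypothetical, and the input is only the first 30 residues of the NOV18a protein sequence as printed in Table 18A, not the full sequence.

```python
def split_at_cleavage(sequence: str, last_signal_residue: int) -> tuple[str, str]:
    """Split a protein into (signal peptide, mature chain).

    last_signal_residue is the 1-based position of the residue just
    before the cleavage site (20 for a site between residues 20 and 21).
    """
    return sequence[:last_signal_residue], sequence[last_signal_residue:]


if __name__ == "__main__":
    nov18a_prefix = "SLLLPLQILLLSLALETAGEEAQGDKIIDG"  # first 30 residues only
    signal, mature = split_at_cleavage(nov18a_prefix, 20)
    print("signal:", signal)
    print("mature:", mature)
```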



A search of the NOV18a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 18D. 



Table 18D. Geneseq Results for NOV18a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV18a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 


AAU82740 


Amino acid sequence of 
novel human protease #39 - 
Homo sapiens, 253 aa. 
[WO200200860-A2, 
03-JAN-2002] 


1..250 
4..253 


250/250(100%) 
250/250(100%) 


e-150 



AAW05383 


Human amyloid precursor 
protein protease - Homo 
sapiens, 253 aa. 
[WO9631122-A1, 
10-OCT-1996] 


1..250 
4..253 


250/250 (100%) 




AAK67888 


Human stratum corneum 
chymotryptic recombinant 
enzyme (SCCE) - Homo 
sapiens, 253 aa. 
[WO9500651-A, 
05-JAN-1995] 


1..250 
4..253 


250/250 (100%) 
250/250 (100%) 


e-150 


AAB21326 


Human HSCEE - Homo 
sapiens, 257 aa. 
[WO200053776-A2, 
14-SEP-2000] 


1..250 
4..257 


249/255 (97%) 
249/255 (97%) 


e-146 


AAB98502 


Human Stratum Corneum 
Chymotryptic Enzyme, 
SCCE, catalytic domain - 
Homo sapiens, 225 aa. 
[WO200129056-A1, 
26-APR-2001] 


26..250 
1..225 


225/225 (100%) 
225/225 (100%) 


e-136 



In a BLAST search of public sequence databases, the NOV18a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 18E. 



Table 18E. Public BLASTP Results for NOV18a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV18a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


P49862 


Kallikrein 7 precursor (EC 
3.4.21.-) (Stratum corneum 
chymotryptic enzyme) 
(hSCCE) - Homo sapiens 
(Human), 253 aa. 


1..250 
4..253 


250/250(100%) 
250/250 (100%) 


e-149 


AAH32005 


Kallikrein 7 (chymotryptic, 
stratum corneum) - Homo 
sapiens (Human), 253 aa. 


1..250 
4..253 


249/250 (99%) 
249/250 (99%) 


e-148 


Q91VE3 


Thymopsin (Stratum 
corneum chymotryptic 
enzyme) - Mus musculus 
(Mouse), 249 aa. 


3..250 
5..249 


185/248 (74%) 
212/248 (84%) 


e-111 






AAN03663 


Kallikrein 7 short variant 
protein - Homo sapiens 
(Human), 181 aa. 


70..250 
1..181 


181/181 (100%) 


e~Jfa7^"" feA ' 


Q9R048 


Stratum corneum 
chymotryptic enzyme - Mus 
musculus (Mouse), 234 aa 
(fragment). 


3..235 
5..234 


175/233 (75%) 
198/233 (84%) 


e-102 



PFam analysis predicts that the NOV18a protein contains the domains shown in the 
Table 18F. 



Table 18F. Domain Analysis of NOV18a 


Pfam Domain 


NOV18a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


trypsin 


27..242 


93/262(35%) 
182/262(69%) 


3.8e-87 



Example 19. 

The NOV19 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 19A. 



Table 19A. NOV19 Sequence Analysis 




SEQ ID NO: 105 | 2028 bp 


NOV19a, 
CG146279-01 
DNA Sequence 


TTGAGGACTTTATTATTATTTGGGTTCTTTTCATTTCTTCCCCTTCTGGGCAACGAAGCAATGA^T 


TTCCAATCGAGACGCCAAGAAAACAGGTGAACTGGGATCCTAAAGTGGCCGTTCCCGCAGCAGCACC 
GGTGTGCCAGCCCAAGAGCGCCACTAACGGGCAACCCCCGGCTCCGGCTCCGACTCCAACTCCGCGC 
CTGTCCATTTCCTCCCGAGCCACAGTGGTAGCCAGGATGGAAGGCACCTCCCAAGGGGGCTTGCAGA 
CCGTCATGAAGTGGAAGACGGTGGTTGCCATCTTTGTGGTTGTGGTGGTCTACCTTGTCACTGGCGG 




GCGGAATTCCTGCGGGATCATGTCTGTGTGAGCCCCCAGGAGCTGGAGACGTTGATCCAGCATGCTC 
TTGATGCTGACAATGCGGGAGTCAGTCCAATAGGAAACTCTTCCAACAACAGCAGCCACTGGGACCT 
CGGCAGTGCCTTTTTCTTTGCTGGAACTGTCATTACGACCATAGGGTATGGGAATATTGCTCCGAGC 
ACTGAAGGAGGCAAAATCTTTTGTATTTTATATGCCATCTTTOT 

TGGCTGGAATTGGAGACCAACTTGGAACCATCTTTGGGAAAAGCATTGCAAGAGTGGAGAAGGTCTT 
TCGAAAAAAGCAAGTGAGTC^GACCAAGATCCGGGTC^TCTCiVACCATCCTGTTCATCTTGGCCGGC 
TGCATTGTGTTTGTGACGATCCCTGCTGTCATCTTTAAGTACATCGAGGGCTGGACGGCCTTGGAGT 
CCATTTACTTTGTGGTGGTCACTCTGACCACGGTGGGCTTTGGTGATTTTGTGGCAGGGGGAAACGC 
TGGCATC^TTATCGGGAGTGGTATAAGCCCCTAGTGTCSGTTTTGGATCCTTGTTGGCCTTGCCTAC 
TTTGCAGCTGTCCTCAGTATGATCX3GAGATTGGCTACGGGTTCTGTCCAAAAAGACAAAAGAAGAGG 
TGGGTGAAATCAAGGCCCATGCGGCAGAGTGGAAGGCCAATG 

GCGAAGGCTCAGCGTGGAGATCCACGATAAGCTGCAGCGGGCGGCCACCATCCGCAGCATGGAGCGC 
CGGCGGCTGGGCCTGGACCAGCGGGCCCACTCACTGGACATGCTGTCCCCCGAGAAGCGCTCTGTCT 
TTGCTGCCCTGGACACCGGCCGCTTCAAGGCCTCATCCCAGGAGAGCATCAACAACCGGCCCAACAA 
CCTGCGCCTGAAGGGGCCGGAGCAGCTGAAC^GCATGGGCAGGGTGCGTCCGAGGACAACATCATC 
AACAAGTTCGGGTCCACCTCCAGACTCACCAAGAGGAAAAACAAGGACCTCAAAAAGACCTTGCCCG 
AGGACGTTCAGAAAATCTACAAGACCTTCCGGAATTACTCCCTGGACGAGGAGAAGAAAGAGGAGGA 
GACGGAAAAGATGTGTAACTCAGACAACTCCAGCACAGCCATGCTGACGGACTGTATCCAGCAGCACj 




GCTGAGTTGGAGAACGGAATGATACCCACGGACACCl^ 

TTGAAGACAGAAACTAAATGTGAAGGACATTGGTCTTGGACTGAGCGTTGTGTGTGTGTGTGTGTGT 
GTTTTTAATATTCACACTGAGACATGTGCCTTAAACAGACTTTTTAGTCCAAAATTACATAGCATTG 




AAGAATATATTTCACTGTGCCATAAACAACTGAAAGCTTGCTCTGCCAAAAGGAATCAGAGAACAAG 




aacttcatttcagatagcaaacgcaggacacaccaagagtgtccgtgcacgtagccggttctggccg 
tacatgttaagggcatttcagtggcagtgctgtacccctgggcagtgctacctgggcacacacgtaH 




ACAAGGGCAGCTATTCCT 




ORF Start: ATG at 61 | ORF Stop: TAA at 1690 





SEQ ID NO: 106 | 543 aa | MW at 60334.6kD 


NOV19a, 
CG146279-01 
Protein Sequence 


l^fpietprkqvnwdpkv^ 
lqtvmkwktwaifvwwylvtggl^ 

haldadi^gvspignssnnsshtolgsafffagtvittigygniapsteggkifcilyaifgiplfg 

fliagigdqlgtifgksiarvekvfrkkqvsqtkirvistilfilagcivfotipavifkyiegwta 

lesiyfvvvtltotgfgdfvaggnaginyrefc^ 

eevgeikahaaewkaiwtaefret^ 

svfaaldtgrfkassqes innrpnnlrlkgpeqlnkhgqgasedni inkfgstsrltkrknkdlkkt 
lpedvqkiyktfrotsldeekkeeetekmcnsdnsstamltdciqqiiaelengmiptdtkdrepenn 

SLLEDRN 



Further analysis of the NOV19a protein yielded the following properties shown in 
Table 19B. 



Table 19B. Protein Sequence Properties NOV19a 


PSort analysis: 


0.6000 probability located in plasma membrane; 0.4000 probability located in 
Golgi body; 0.3000 probability located in endoplasmic reticulum (membrane); 
0.3000 probability located in microbody (peroxisome) 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV19a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 19C. 



Table 19C. Geneseq Results for NOV19a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV19a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 



AAU81354 


Novel human ion channel 
protein #34 - Homo sapiens, 
543 aa. [WO200185788-A2, 
15-NOV-2001] 


1..543 
1..543 


543/543 (100%) 




AAU79472 


Human novel transporter 
protein - Homo sapiens, 543 
aa. [WO200224748-A2, 
28-MAR-2002] 


1..543 
1..543 


543/543 (100%) 
543/543 (100%) 


0.0 


AAU79473 


Human novel transporter 
protein variant - Homo 
sapiens, 543 aa. 
[WO200224748-A2, 
28-MAR-2002] 


1..543 
1..543 


542/543 (99%) 
543/543 (99%) 


0.0 


AAE16596 


Human TWIK-Related K+ 
channel-2 (TREK-2) protein 
- Homo sapiens, 538 aa. 
[WO200200715-A2, 
03-JAN-2002] 


18..543 
13..538 


526/526 (100%) 
526/526 (100%) 


0.0 


AAB47930 


Human TREK2 - Homo 
sapiens, 538 aa. 
[WO200200715-A2, 
03-JAN-2002] 


18..543 
13..538 


526/526 (100%) 
526/526 (100%) 


0.0 


In a BLAST search of public sequence databases, the NOV19a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 19D. 


Table 19D. Public BLASTP Results for NOV19a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV19a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


Q8TDK7 


Potassium channel TREK2 
splice variant b - Homo 
sapiens (Human), 543 aa. 


1..543 
1..543 


542/543 (99%) 
542/543 (99%) 


0.0 


P57789 


Potassium channel subfamily 
K member 10 (Outward 
rectifying potassium channel 
protein TREK-2) (TREK-2 
K+ channel subunit) - Homo 
sapiens (Human), 538 aa. 


18..543 
13..538 


526/526 (100%) 
526/526 (100%) 


0.0 


Q8TDK8 


Potassium channel TREK2 
splice variant a - Homo 
sapiens (Human), 543 aa. 


18..543 
18..543 


525/526 (99%) 
525/526 (99%) 


0.0 






Q9JIS4 


Potassium channel subfamily 
K member 10 (Outward 
rectifying potassium channel 
protein TREK-2) (TREK-2 
K+ channel subunit) - Rattus 
norvegicus (Rat), 538 aa. 


1..543 
1..538 


520/544 (95%) 
529/544 (96%) 


0.0 


P97438 


Potassium channel subfamily 
K member 2 (Outward 
rectifying potassium channel 
protein TREK-1) (Two-pore 
potassium channel TPKC1) 
(TREK-1 K+ channel 
subunit) - Mus musculus 
(Mouse), 411 aa. 


22..404 
2..369 


247/384 (64%) 
301/384 (78%) 


C-13D 



PFam analysis predicts that the NOV19a protein contains the domains shown in the 
Table 19E. 




Table 19E. Domain Analysis of NOV19a 


Pfam Domain 


NOV19a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


ion_trans 


158..323 


41/231 (18%) 
119/231(52%) 


0.046 



Example 20. 

The NOV20 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 20A. 



Table 20A. NOV20 Sequence Analysis 




SEQ ID NO: 107 | 2958 bp 


NOV20a, 
CG146374-01 
DNA Sequence 


GCTCCTCCCCGCTGGCGGGGGGAGAAAGGGCAGGAGGCCTTCCGTCCCGGCTATA^GGGCCCCGGA 


CCGCCGCGGCTCGCCTCGGCTTGCCTCGACACGCCTAGGCGCCCTCCGGCTCCGCCCTAGCCGCCGC 


GTCCCAGCTAGAGCTCCAGCGCCCGCTCAGGCCCCACTCGACCCTCTCGGGCCTCGGCTACTTGGAC 


TGCGGCGGMTATGGCGGCTCCGATGACTCCCGCGGCTCGGCCCGAGGACTACGAGGCGGCGCTCAA 
TGCCGCCCTGGCTGACGTGCCCGAACTGGCCAGACTCCTGGAGATCGACCCGTACTTGAAGCCCTAC 
GCCGTGGACTTCCAGCGCAGGTATAAGCAGTTTAGCCAAATTTTGAAGAACATTGGAGAAAATGAAG 
GTGGTATTGATAAGTTTTCCAGAGGCTATGAATCATTTGGCGTCCACAGATGTGCTGATGGTGGTTT 
ATAC TGC AAAGAATGGGCCCCGGGAGCAGAAGGAGTTTTTCTTAC TGGAGATTTTAATGGTTGGAAT 
CCATTTTCGTACCCATACAAAAAACTGGATTATGGAAAATGGGAGCTGTATATCCCACCAAAGCAGA 
ATAAATCTGTACTCGTGCCTCATGGATCCAAATTAAAGGTAGTTATTACTAGTAAAAGCGGAGAGAT 
CTTGTATCGTATTTCACCGTGGGCAAAGTATGTGGTTCGTGAAGGTGATAATGTGAATTATGATTGG 
ATACACTGGGATCCAGAACACTCATATGAGTTTAAGCATTCCAGACCAAAGAAGCCACGGAGTCTAA 
GAATTTATGAATCTCATGTGGGAATTTCTTCCCATGAAGGAAAAGTAGCTTCTTATAAACATTTTAC 
ATGCAATGTACTACCAAGAATCAAAGGCCTTGGATACAACTGCATTCAGTTGATGGCAATCATGGAG 



CATGCTTACTATGCCAGCTTTGGTTACCAAATCACife 

C AC CTGAAGAGCTAC AAGAAC TGGTAGAC ACAGCTCATTCCATGGGT ATCATAGTCC TCTTAGATGT 
GGTACACAGCCATGCTTCAAAAAATTCAGCAGATGGATTGAATATGTTTGATGGGACAGATTCCTGT 
TATTTTCATTCTGGACCTAG AGGGACTCATGATCTTTGGGATAGC AGATTGTTTGCC TAC TCCAGGT 
TGAATATTOCAGACATCTA AGCCAATTAGAATCATGATTGTTTTGATTGCCAGAAATCCTTAAATCT 
GGGAAGTTTTAAGATTCCTTCTGTCAAACATAAGATGGTGGTTGGAAGAATATCGCTTTGATGGATT 
TCGTTTTGATGGTGTTACGTCCATGCTTTATCATCACCATGGAGTGGGTCAAGGTTTCTCAGGTGAT 
TACAGTGAATATTTCGGACTACAAGTAGATGAAGATGCCTTGACTTACCTCATGTTGGCAAATCATT 
TGGTTCACACGCTGTGTCCCGATTCTATAACAATAGCTGAGGATGTATCAGGAATGCCAGCTCTGTG 
C TC TC CAATTTCCC AGGGAGGGGGTGGTTTTG ACTATCG ACTAGCC ATGGC AATTCC AGATAAGTGG 
ATTCAGCTACTTAAAGAGTTTAAAGATGAAGACTGGAACATGGGCGATATAGTATACACGCTCACAA 
ACAGGCGCTACCTTGAAAAGTGC ATTGC TTATGCAGAG AGCC ATG ATC AGGC ATTGGT TGGGGAT AA 
GTCGC TGGCATTTTGGTTGATGGATGCCGAAATGTATAC AAACATGAGTGT CC TGAC TCC TTTTACT 
CC AGTTAT TGATCGTGGAATAC AGC TTC ATAAAATG AT TCGAC TC ATT ACGC ATGGG C TTGGTGG AG 
AAGGCTATCTCAATTTCATGGGTAATGAATTTGGGCATCCTGAATGGTTAGACTTCCCAAGAAAAGG 
AAATAATGAGAGTTACCATTATGCCAGGCGGCAGTTTCATTTAACTGACGACGACCTTCTTCGCTAC 
AAGTTCCTAAATAATTTTGACAGGGATATGAATAGATTGGAAGAAAGATATGGTTGGCTTGCAGCTC 
CACAGGCCTACGTGAGTGAAAAACATGAAGGCAATAAGATCATTGCTTTTGAAAGAGCAGGTCTTCT 
TTTCATTTTCAACTTCCATCCAAGCAAGAGCTACACTGACTACCGAGTTGGAACAGCATTGCCAGGG 
AAATTCAAAATTGTGCTAGATTCAGATGCAGCGGAATATGGAGGGCATCAGAGACTGGACCACAGCA 
CTGACTTTTTTTCTGAGGCTTTTGAACATAATGGGCGTCCCTATTCTCTTTTGGTGTACATTCCAAG 
C AGAGTGGCCCTCATCC TTCAGAATGTGG ATCTGCCGAATTGAAGAGGCC TGATTTC AGCTCCACC A 
GATGCAGATTTGTGTTTTGTTTTCTTGTTATCACTGTCACACAGCTTATAACATGTATGCTTTTCAG 
AATACAGTTGTCTAGCCAAGCCATCAAGTGTCTGAAATTCAATATTGGTTTATGCAAATACAGCAAA 
C TTTTATTTAAGTAGATAGGAGAATATGTTTAAAATATTAGGAATCCTAGACC AT AT T TTCAAGTC A 
TCTTAGCAGCTAGGATTCTCAAATGGAAGTGTTATATATAATATGTTAAAAACATTTTGCTTTCCTG 
GCTAATTATTTGATCCTTTTAAATTCAAATTTGAATCATTTGTCATGTATGATTATTTCTGTTAAAT 
GTACACAGTATTTAAGATGGATATTTGGTGGCTCTATTTGTTCTGATATCTTTTGGTCTAAATTATG 
AGGTACCAAGATTGTTTCTTTGTTTCTTTTTTTCAAATTGTGTTTAGAAATACTGTAATAAATATGC 
AGTAGTGATATAAAGAATTATATCCAAGGTAATATAAAAGCCATTACGTATGAACTCAAAAAAAAAA 
AAAAAAAAAA 

ORF Start: ATG at 213 | ORF Stop: TAA at 1224 





SEQ ID NO: 108 |337 aa |MW at 38247.8kD 


NOV20a, 
CG146374-01 
Protein Sequence 


MAAPMT PAARPEDYEAALNAALADVP ELARLLE IDPYLKP YAVDFQRR YKQF S Q I LKN IGENEGG I D 

KFSRGYESFGVHRCAIXSGLYCKEWAPGAEGVFLTGDF^ 

LVPHGSKLKWITSKSGEILYRISPWAKYVVREGDlsrVISrroWIHWDP 

SHVGI SSHEGKVAS YKHFTCNVLPRIKGLGYNC IQLMAIMEHAYYASFGYQ ITSFFAASSRYGSPEE 

LQELVDTAHSMGIIVLLDVVHSHASKNSADGLNMFDGTDSCYFHSGPRGTHDLWDS 

DI 



Further analysis of the NOV20a protein yielded the following properties shown in 
Table 20B. 



Table 20B. Protein Sequence Properties NOV20a 


PSort analysis: 


0.7480 probability located in microbody (peroxisome); 0.6000 probability 
located in nucleus; 0.1000 probability located in mitochondrial matrix space; 
0.1000 probability located in lysosome (lumen) 


SignalP analysis: 


No Known Signal Sequence Predicted 
A search of the NOV20a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 20C. 



Table 20C. Geneseq Results for NOV20a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV20a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAB90803 


Human shear stress-response 
protein SEQ ID NO: 106 - 

Hnmr* cqhiptic TOO 51 O 
nUlllU octpiCilo, / v/Zr da. 

[WO200125427-A1, 
12-APR-2001] 


1..330 
1..330 


328/330 (99%) 
329/330(99%) 


0.0 


ABB60350 


Drosophila melanogaster 
polypeptide SEQ ID NO 

/ ©*+^ - x/x\JoVyiUlcL 

melanogaster, 865 aa. 

[WO200171042-A2, 

27-SEP-2001] 


22..329 
1..314 


170/314 (54%) 
227/314 (72%) 


e-102 


AAB49603 


Glycogen branching enzyme 
amino acid sequence - 
Aspergillus nidulans, 686 aa. 
[JP2000279180-A, 
10-OCT-2000] 


31..329 
12..314 


175/305 (57%) 
228/305 (74%) 


le-98 


AAG39093 


Arabidopsis thaliana protein 
fragment SEQ ID NO: 48322 
- Arabidopsis thaliana, 721 
aa. [EP1033405-A2, 
06-SEP-2000] 


30..329 
22..321 


161/302 (53%) 
214/302 (70%) 


3e-92 


AAG39092 


Arabidopsis thaliana protein 
fragment SEQ ID NO: 48321 
- Arabidopsis thaliana, 858 
aa. [EP1033405-A2, 
06-SEP-2000] 


30..329 
159..458 


161/302 (53%) 
214/302 (70%) 


3e-92 



In a BLAST search of public sequence databases, the NOV20a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 20D. 



Table 20D. Public BLASTP Results for NOV20a 



Protein 

Accession 

Number 


Protein/Organism/Length 


NOV20a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


Q96EN0 


Similar to glucan 
(1,4-alpha-), branching 
enzyme 1 (glycogen 
branching enzyme, Andersen 
disease, glycogen storage 
disease type IV) - Homo 
sapiens (Human), 702 aa. 


1..330 
1..330 


330/330 (100%) 
330/330 (100%) 


0.0 


Q04446 


1,4-alpha-glucan branching 
enzyme (EC 2.4.1.18) 
(Glycogen branching 
enzyme) (Brancher enzyme) 
- Homo sapiens (Human), 
702 aa. 


1..330 
1..330 


328/330 (99%) 
329/330 (99%) 


0.0 


Q9D6Y9 


2310045H19Rik protein 
(RIKEN cDNA 2310045H19 
gene) - Mus musculus 
(Mouse), 702 aa. 


1..330 
1..330 


291/330(88%) 
310/330(93%) 


e-179 


AAF58416 


CG4023-PA - Drosophila 
melanogaster (Fruit fly), 685 
aa. 


22..329 
1..314 


170/314(54%) 
227/314(72%) 


e-102 


Q9V6K7 


CG4023 protein - Drosophila 
melanogaster (Fruit fly), 865 
aa. 


22..329 
1..314 


170/314(54%) 
227/314(72%) 


e-102 



Pfam analysis predicts that the NOV20a protein contains the domains shown in
Table 20E.



Table 20E. Domain Analysis of NOV20a


Pfam Domain


NOV20a Match Region 


Identities/ 
Similarities 
for the Matched 
Region 


Expect Value 


isoamylase_N


73..168 


31/123 (25%)
64/123 (52%) 


5.1e-11
Example 21.


The NOV21 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 21A. 



Table 21A. NOV21 Sequence Analysis 




SEQ ID NO: 109 | 885 bp




NOV21a, 
CG146403-01 
DNA Sequence 


TGGATGCTGGCGGTCCTCTACCTGGTCTGGCTCTATTGGGATAGAAACATACCCAGGGCTGGTGGAA 
GGCGTTCGGAGTGGATAAGGAACCGGGCAATTTGGAGACAACTAAGGGATTATTATCCTGTCAAGCT
GGTGAAAACAGCAGAGCTGCCCCCGGATCGGAACTACGTGCTGGGCGCCCACCCTCATGGGATCATG 
TGTACAGGCTTCCTCTGTAATTTCTCCACCGAGAGCAATGGCTTCTCCCAGCTCTTCCCGGGGCTCC 
GGCCCTGGTTAGCCGTGCTGGCTGGCCTCTTCTACCTCCCGGTCTATCGCGACTACATCATGTCCTT 
TGGTCTCTGTCCGGTGAGCCGCCAGAGCCTGGACTTCATCCTGTCCCAGCCCCAGCTCGGGCAGGCC 
GTGGTCATCATGGTGGGGGGTGCGCACGAGGCCCTGTATTCAGTCCCCGGGGAGCACTGCCTTACGC 
TCCAGAAGCGCAAAGGCTTCGTGCGCCTGGCGCTGAGGCACGGGGCGTCCCTGGTGCCCGTGTACTC 
CTTTGGGGAGAATGACATCTTTAGACTTAAGGCTTTTGCCACAGGCTCCTGGCAGCATTGGTGCCAG
CTCACCTTCAAGAAGCTCATGGGCTTCTCTCCTTGCATCTTCTGGGGTCGCGGTCTCTTCTCAGCCA 
CCTCCTGGGGCCTGCTGCCCTTTGCTGTGCCCATCACCACTGTGGGTGAGCCCATCCCCGTCCCCCA 
GCGCCTCCACCCCACCGAGGAGGAAGTCAATCACTATCACGCCCTCTACATGACGGCCCTGGAGCAG 
CTCTTCGAGGAGCACAAGGAAAGCTGTGGGGTCCCCGCTTCCACCTGCCTCACCTTCATCTAGGCCT
GGCCGCGGCCTTTC 




ORF Start: ATG at 4 | ORF Stop: TAG at 865





SEQ ID NO: 110 | 287 aa | MW at 32641.7kD


NOV21a, 
CG146403-01 
Protein Sequence 


MLAVLYLVWLYWDRNIPRAGGRRSEWIRNRAIWRQLRDYYPVKLVKTAELPPDRNYVLGAHPHGIMC
TGFLCNFSTESNGFSQLFPGLRPWLAVLAGLFYLPVYRDYIMSFGLCPVSRQSLDFILSQPQLGQAV
VIMVGGAHEALYSVPGEHCLTLQKRKGFVRLALRHGASLVPVYSFGENDIFRLKAFATGSWQHWCQL
TFKKLMGFSPCIFWGRGLFSATSWGLLPFAVPITTVGEPIPVPQRLHPTEEEVNHYHALYMTALEQL
FEEHKESCGVPASTCLTFI



Further analysis of the NOV21a protein yielded the following properties shown in 
Table 21B. 



Table 21B. Protein Sequence Properties NOV21a 


PSort analysis: 


0.5500 probability located in endoplasmic reticulum (membrane); 0.3814 
probability located in lysosome (lumen); 0.3200 probability located in 
microbody (peroxisome); 0.1000 probability located in endoplasmic reticulum 
(lumen) 


SignalP analysis: 


Cleavage site between residues 22 and 23 
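The PSort entry in Table 21B is a semicolon-separated list of probability and location pairs. The following is a minimal sketch of how such an entry could be split and ranked; the function name `parse_psort` is hypothetical and is not part of PSort itself, and the input string is simply the Table 21B text joined onto one line.

```python
import re

def parse_psort(text):
    """Return (probability, location) pairs, highest probability first.

    Assumes each entry reads '<number> probability located in <location>'
    and that entries are separated by semicolons, as in Table 21B.
    """
    pattern = re.compile(r"(\d+\.\d+)\s+probability\s+located\s+in\s+([^;]+)")
    hits = []
    for match in pattern.finditer(text):
        # Collapse any line-wrap whitespace inside the location name.
        location = " ".join(match.group(2).split())
        hits.append((float(match.group(1)), location))
    return sorted(hits, key=lambda hit: hit[0], reverse=True)

# PSort text for NOV21a, copied from Table 21B.
psort_nov21a = (
    "0.5500 probability located in endoplasmic reticulum (membrane); "
    "0.3814 probability located in lysosome (lumen); "
    "0.3200 probability located in microbody (peroxisome); "
    "0.1000 probability located in endoplasmic reticulum (lumen)"
)

ranked = parse_psort(psort_nov21a)
best_probability, best_location = ranked[0]
```

The top-ranked location is only the most probable of several predictions, so the remaining entries are kept rather than discarded.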



A search of the NOV21a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 21C.
Table 21C. Geneseq Results for NOV21a


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV21a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 


AAM80262 


Human protein SEQ ID NO 
3908 - Homo sapiens, 223 
aa. [WO200157190-A2, 
09-AUG-2001] 


43..237 
29..223 


195/195 (100%) 
195/195 (100%) 


e-115 


ABB75677 


Breast protein-eukaryotic 
conserved gene 1 
(BSTP-ECG1) protein - 
Homo sapiens, 388 aa. 
[WO200208260-A2, 
31-JAN-2002]


1..284 
101..385


158/285 (55%) 
218/285 (76%) 


1e-97


AAB66170 


Protein of the invention #82 - 
Unidentified, 388 aa. 
[WO200078961-A1, 
28-DEC-2000] 


1..284 
101..385


158/285 (55%) 
218/285 (76%) 


1e-97


AAU29191 


Human PRO polypeptide 
sequence #168 - Homo 
sapiens, 388 aa. 
[WO200168848-A2, 
20-SEP-2001] 


1..284 
101..385 


158/285 (55%) 
218/285 (76%) 


1e-97


AAY99421 


Human PRO1433 (UNQ738)
amino acid sequence SEQ ID 
NO:292 - Homo sapiens, 388 
aa. [WO200012708-A2, 
09-MAR-2000] 


1..284 
101..385 


158/285 (55%) 
218/285 (76%) 


1e-97
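Each Identities/Similarities cell in Table 21C gives the number of matching positions over the alignment length, followed by a percentage. The sketch below parses such a cell and recomputes the percentage. It assumes the percentage is truncated rather than rounded; that is an inference from entries elsewhere in these tables such as 230/330 (69%), not a documented rule, and both function names are hypothetical.

```python
import re

def parse_match_cell(cell):
    """Split a cell such as '158/285 (55%)' into (matched, aligned, percent)."""
    m = re.fullmatch(r"\s*(\d+)\s*/\s*(\d+)\s*\(\s*(\d+)\s*%\s*\)\s*", cell)
    if m is None:
        raise ValueError(f"unrecognised identities cell: {cell!r}")
    return int(m.group(1)), int(m.group(2)), int(m.group(3))

def truncated_percent(matched, aligned):
    """Integer percentage with the fractional part dropped (assumed convention)."""
    return 100 * matched // aligned

# Identities and similarities reported for the 388 aa hits in Table 21C.
identities = parse_match_cell("158/285 (55%)")
similarities = parse_match_cell("218/285 (76%)")
```

Recomputing the percentage from the two counts is a cheap way to spot cells where a digit was misread during extraction.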



In a BLAST search of public sequence databases, the NOV21a protein was found to
have homology to the proteins shown in the BLASTP data in Table 21D.



Table 21D. Public BLASTP Results for NOV21a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV21a 
Residues/
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q9UDW7 


WUGSC:H_DJ0747G18.5
protein - Homo sapiens 
(Human), 261 aa (fragment). 


43..287 
16..261 


244/246 (99%) 
244/246 (99%) 


e-145 
CAD38961


Hypothetical protein - Homo 
sapiens (Human), 434 aa 
(fragment). 



1..284 

147..431 


158/285 (55%) 
218/285 (76%) 


3e-97 


Q96PD7 


Diacylglycerol 
acyltransferase 2 
(Hypothetical 43.8 kDa 
protein) - Homo sapiens 
(Human), 388 aa. 


1..284 
101..385 


158/285 (55%) 
218/285 (76%) 


3e-97 


Q9BYE5 


GS1999full protein - Homo 
sapiens (Human), 297 aa. 


1..284 
10..294 


158/285 (55%) 
218/285 (76%) 


3e-97 


Q9DCV3 


0610010B06Rik protein 
(Diacylglycerol 
acyltransferase 2) - Mus 
musculus (Mouse), 388 aa. 


1..284 
101..385 


159/285 (55%) 
217/285 (75%) 


8e-97 
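The Expect Value column uses BLAST shorthand in which a bare exponent such as e-145 stands for 1e-145. A minimal sketch of converting these cells to numbers so that hits can be ranked follows; `parse_expect` is a hypothetical helper, and the accession-to-value pairs are copied from Table 21D.

```python
def parse_expect(cell):
    """Convert an Expect Value cell ('e-145', '3e-97', '0.0') to a float."""
    text = cell.strip().lower()
    if text.startswith("e"):
        # BLAST drops the leading 1 when the mantissa is exactly one.
        text = "1" + text
    return float(text)

# Expect values for the NOV21a hits listed in Table 21D.
table_21d = {
    "Q9UDW7": "e-145",
    "CAD38961": "3e-97",
    "Q96PD7": "3e-97",
    "Q9BYE5": "3e-97",
    "Q9DCV3": "8e-97",
}

# Smaller expect values indicate matches less likely to arise by chance.
best_hit = min(table_21d, key=lambda acc: parse_expect(table_21d[acc]))
```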



Example 22. 

The NOV22 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 22A. 



Table 22A. NOV22 Sequence Analysis 




SEQ ID NO: 111 | 1135 bp


NOV22a, 
CG146513-01
DNA Sequence 


CACAGTAAGAGATTATAGCAAAGCATCTATAATCAACTCAGCTTAAGAAGTTTTGACCTTCTGGTTA 


GGCTTCTTGCCACAACAGAACAGCACCATAACCATGGCTTTCTTCTCCCGACTGAATCTCCAGGAGG 


GCCTCCAAACCTTCTTTGTTTTGCAATGGATCCCAGTCTATATATTTTTAGGAGCTATTCCCATTCT
CCTTATACCCTACTTTCTGTTATTCAGTAAGTTCTGGCCCTTGGCTGTGCTCTCCTTAGCCTGGCTC 
ACCTATGATTGGAACACCCACAGTCAAGGTGGCAGGCGTTCAGCTTGGGTACGAAACTGGACCCTAT 
GGAAGTATTTCCGAAATTACTTCCCAGTACAGCTGGTGAAGACTCATGATCTTTCTCCCAAACACAA 
CTACATCATTGCCAATCACCCCCATGGCATTCTCTCTTTTGGTGTCTTCATCAACTTTGCCACTGAG 
GCCACTGGCATTGCTCGGATTTTCCCATCCATCACTCCCTTTGTAGGGACCTTAGAAAGGATATTTT
GGATCCCAATTGTGCGAGAATATGTGATGTCAATGGGTGTGTGCCCTGTGAGTAGCTCAGCCTTGAA 
GTACTTGCTGACCCAGAAAGGCTCAGGCAATGCCGTGGTTATTGTGGTGGGTGGAGCTGCTGAAGCT 
CTCTTGTGCCGACCAGGAGCCTCCACTCTCTTCCTCAAGCAGCGTAAAGGTTTTGTGAAGATGGCAC 
TGCAAACAGGGGCATACCTTGTCCCTTCATATTCCTTTGGTGAGAACGAAGTTTTCAATCAGGAGAC 
CTTCCCTGAGGGCACGTGGTTAAGGTTGTTCCAAAAAACCTTCCAGGACACATTCAAAAAAATCCTG
GGACTAAATTTCTGTACCTTCCATGGCCGGGGCTTCACTCGCGGATCCTGGGGCTTCCTGCCTTTCA 
ATCGGCCCATTACCACTGTTGGGGAACCCCTTCCAATTCCCAGGATTAAGAGGCCAAACCAGAAGAC 
AGTAGACAAGTATCACGCACTCTACATCAGTGCCCTGCGCAAGCTCTTTGACCAACACAAAGTTGAA
TATGGCCTCCCTGAGACCCAAGAGCTGACAATTACATAACAGGAGCGACATTCCCCATTGATC




ORF Start: ATG at 101 | ORF Stop: TAA at 1109





SEQ ID NO: 112 | 336 aa | MW at 38493.6kD


NOV22a, 
CG146513-01 
Protein Sequence 


MAFFSRLNLQEGLQTFFVLQWIPVYIFLGAIPILLIPYFLLFSKFWPLAVLSLAWLTYDWNTHSQGG
RRSAWVRNWTLWKYFRNYFPVQLVKTHDLSPKHNYIIANHPHGILSFGVFINFATEATGIARIFPSI
TPFVGTLERIFWIPIVREYVMSMGVCPVSSSALKYLLTQKGSGNAVVIVVGGAAEALLCRPGASTLF
LKQRKGFVKMALQTGAYLVPSYSFGENEVFNQETFPEGTWLRLFQKTFQDTFKKILGLNFCTFHGRG
FTRGSWGFLPFNRPITTVGEPLPIPRIKRPNQKTVDKYHALYISALRKLFDQHKVEYGLPETQELTI
T
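The ORF coordinates and the protein length reported in each sequence table are tied together by simple arithmetic: the distance from the first base of the start codon to the first base of the stop codon, divided by three, should equal the number of residues. A short sketch of that check, using the values given for NOV21a and NOV22a, follows; `orf_length_aa` is a hypothetical helper name.

```python
def orf_length_aa(start, stop):
    """Residue count implied by 1-based ORF start and stop codon positions.

    `start` is the position of the A of ATG; `stop` is the position of the
    first base of the stop codon, which is not translated.
    """
    span = stop - start
    if span <= 0 or span % 3 != 0:
        raise ValueError("start and stop are not in the same reading frame")
    return span // 3

# (ORF start, ORF stop, reported protein length) from Tables 21A and 22A.
reported = {
    "NOV21a": (4, 865, 287),
    "NOV22a": (101, 1109, 336),
}

consistent = {
    name: orf_length_aa(start, stop) == length
    for name, (start, stop, length) in reported.items()
}
```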
Further analysis of the NOV22a protein yielded the following properties shown in
Table 22B. 



Table 22B. Protein Sequence Properties NOV22a 


PSort analysis: 


0.6850 probability located in plasma membrane; 0.6400 probability located in 
endoplasmic reticulum (membrane); 0.3880 probability located in microbody 
(peroxisome); 0.3700 probability located in Golgi body 


SignalP analysis: 


Cleavage site between residues 65 and 66 



A search of the NOV22a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 22C.



Table 22C. Geneseq Results for NOV22a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV22a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAM06866 


Human foetal protein, SEQ 
ID NO: 1074 -Homo 
sapiens, 225 aa. 
[WO200155339-A2, 
02-AUG-2001] 


1..216 
1..216 


211/216(97%) 
214/216 (98%) 


e-124 


ABB75677 


Breast protein-eukaryotic 
conserved gene 1 
(BSTP-ECG1) protein - 
Homo sapiens, 388 aa. 
[WO200208260-A2, 
31-JAN-2002] 


1..335 
56..387 


171/337 (50%) 
237/337 (69%) 


e-101 


AAB66170 


Protein of the invention #82 - 
Unidentified, 388 aa. 
[WO200078961-A1, 
28-DEC-2000] 


1..335 
56..387 


171/337 (50%) 
237/337 (69%) 


e-101 


AAU29191 


Human PRO polypeptide 
sequence #168 - Homo 
sapiens, 388 aa. 
[WO200168848-A2, 
20-SEP-2001] 


1..335 
56..387 


171/337 (50%) 
237/337 (69%) 


e-101 
AAY99421


Human PRO1433 (UNQ738)
amino acid sequence SEQ ID 
NO:292 - Homo sapiens, 388 
aa. [WO200012708-A2, 
09-MAR-2000] 


1..335 
56..387 


171/337 (50%) 

237/337 (69%) 


e-101 



In a BLAST search of public sequence databases, the NOV22a protein was found to
have homology to the proteins shown in the BLASTP data in Table 22D.



Table 22D. Public BLASTP Results for NOV22a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV22a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q9DCV3 


0610010B06Rik protein 
(Diacylglycerol 
acyltransferase 2) - Mus 
musculus (Mouse), 388 aa. 


1..335 
56..387 


172/337 (51%) 
238/337 (70%) 


e-101 


CAD38961 


Hypothetical protein - Homo 
sapiens (Human), 434 aa 
(fragment). 


1..335 
102..433 


171/337 (50%) 
237/337 (69%) 


e-100 


Q96PD7 


Diacylglycerol 
acyltransferase 2 
(Hypothetical 43.8 kDa 
protein) - Homo sapiens 
(Human), 388 aa. 


1..335 
56..387 


171/337 (50%) 
237/337 (69%) 


e-100 


Q8TAB1 


BA351K23.5 (Novel protein) 
- Homo sapiens (Human), 
296 aa (fragment). 


38..335 
1..295 


161/299 (53%) 
221/299 (73%) 


2e-98 


Q9BYE5 


GS1999full protein - Homo 
sapiens (Human), 297 aa. 


39..335 
2..296 


161/299 (53%) 
217/299 (71%) 


4e-96 



Example 23. 

The NOV23 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 23A. 



Table 23A. NOV23 Sequence Analysis 

SEQ ID NO: 113 | 1022 bp
NOV23a,
CG146522-01 
DNA Sequence 



ACTGTTCTGAGATC TTTGCCTCC C TCAGGC TCCCGA< 



X3CA< 



CTTCCAGAGTCTGATGCTTCTGCAGTGGCCTTTGAGCTACCTTGCCATCTTGTTCGTCTACCTGCTG 
TTTACATCCTTGTGGCCGCTACCAGTGCTTTACTTTGCCTGGTTGTTCCTGGACTGGAAGACCCCAG
AGCGAGGTGGCAGGCGTTCGGCCTGGGTAAGGAACTGGTGTGTCTGGACCCACATCAGGGACTATTT 
CCCCATTATCCTGAAGACAAAGGACCTATCACCTGAGCACAACTACCTCATGGGGGTTCACCCCCAT
GGCCTCCTGACCTTTGGCGCCTTCTGCAACTTCTGCACTGAGGCCACAGGCTTCTCGAAGACCTTCC 
CAGGCATCACTCCTCACTTGGCCACGCTGTCCTGGTTCTTCAAGATCCCCTTTGTTAGGGAGTACCT 
CATGGCCAAAGGTGTGTGCTCTGTGAGCCAGCCAGCCATCAACTATCTGCTGAGCCATGGCACTGGC
AACCTCGTGGGCATTGTAGTGGGAGGTGTGGGTGAGGCCCTGCAAAGTGTGCCCAACACCACCACCC 
TCATCCTCCAGAAGCGCAAGGGGTTCGTGCGCACAGCCCTCCAGCATGGGGCTCATCTGGTCCCCAC 
CTTCACTTTTGGGGAAACTGAGGTGTATGATCAGGTGCTGTTCCATAAGGATAGCAGGATGTACAAG 
TTCCAGAGCTGCTTCCGCCGTATCTTTGGTTTCTACTGTTGTGTCTTCTATGGACAAAGCTTCTGTC 
AAGGCTCCACTGGGCTCCTGCCATACTCCAGGCCTATTGTCACTGTTGGGGAGCCTCTGCCACTGCC 
CCAAATTGAAAAGCCAAGCCAGGAGATGGTGGACAAATACCATGCACTTTATATGGATGCTCTGCAC
AAACTGTTCGACCAGCATAAGACCCACTATGGCTGCTCAGAGACCCAAAAGCTGTTTTTCCTGTGAA
TGAAGGTACTGCATGCC 



ORF Start: ATG at 42 



ORF Stop: TGA at 1002 





SEQ ID NO: 114 | 320 aa | MW at 36773.5kD


NOV23a, 
CG146522-01 
Protein Sequence 


MAHSKQPSHFQSLMLLQWPLSYLAILFVYLLFTSLWPLPVLYFAWLFLDWKTPERGGRRSAWVRNWC
VWTHIRDYFPIILKTKDLSPEHNYLMGVHPHGLLTFGAFCNFCTEATGFSKTFPGITPHLATLSWFF
KIPFVREYLMAKGVCSVSQPAINYLLSHGTGNLVGIVVGGVGEALQSVPNTTTLILQKRKGFVRTAL
QHGAHLVPTFTFGETEVYDQVLFHKDSRMYKFQSCFRRIFGFYCCVFYGQSFCQGSTGLLPYSRPIV
TVGEPLPLPQIEKPSQEMVDKYHALYMDALHKLFDQHKTHYGCSETQKLFFL



Further analysis of the NOV23a protein yielded the following properties shown in 
Table 23B. 




Table 23B. Protein Sequence Properties NOV23a 


PSort analysis: 


0.7284 probability located in outside; 0.3880 probability located in microbody 
(peroxisome); 0.1000 probability located in endoplasmic reticulum 
(membrane); 0.1000 probability located in endoplasmic reticulum (lumen) 


SignalP analysis: 


Cleavage site between residues 43 and 44 



A search of the NOV23a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 23C.



Table 23C. Geneseq Results for NOV23a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV23a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 






ABB75677


Breast protein-eukaryotic
conserved gene 1
(BSTP-ECG1) protein -
Homo sapiens, 388 aa.
[WO200208260-A2,
31-JAN-2002]


4..317
62..385


165/324 (50%)
224/324 (68%)


1e-93


AAB66170 


Protein of the invention #82 - 
Unidentified, 388 aa. 
[WO200078961-A1, 
28-DEC-2000] 


4..317
62..385


165/324 (50%) 
224/324 (68%) 


1e-93


AAU29191 


Human PRO polypeptide 
sequence #168 - Homo 
sapiens, 388 aa. 
[WO200168848-A2, 
20-SEP-2001] 


4..317
62..385


165/324 (50%) 
224/324 (68%) 


1e-93


AAY99421 


Human PRO1433 (UNQ738)
amino acid sequence SEQ ID 
NO:292 - Homo sapiens, 388 
aa. [WO200012708-A2, 
09-MAR-2000] 


4..317
62..385


165/324 (50%) 
224/324 (68%) 


1e-93


AAY94889 


Human protein clone 
HP02485 - Homo sapiens, 
334 aa. [WO200005367-A2, 
03-FEB-2000] 


11..319
16..333


144/318 (45%) 
200/318 (62%) 


3e-74 


In a BLAST search of public sequence databases, the NOV23a protein was found to
have homology to the proteins shown in the BLASTP data in Table 23D.


Table 23D. Public BLASTP Results for NOV23a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV23a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q8TAB1 


BA351K23.5 (Novel protein) 
- Homo sapiens (Human), 
296 aa (fragment).


30..317
3..293 


163/291 (56%) 
214/291 (73%) 


5e-96 


Q9DCV3 


0610010B06Rik protein 
(Diacylglycerol 
acyltransferase 2) - Mus 
musculus (Mouse), 388 aa. 


4..317
62..385


166/324(51%) 
225/324 (69%) 


2e-93 


CAD38961 


Hypothetical protein - Homo 
sapiens (Human), 434 aa 
(fragment). 


4..317
108..431 


165/324 (50%) 
224/324 (68%) 


3e-93 
Q96PD7


Diacylglycerol 
acyltransferase 2 
(Hypothetical 43.8 kDa 
protein) - Homo sapiens 
(Human), 388 aa. 


4..317
62..385


165/324 (50%)
224/324 (68%)


3e-93


Q9BYE5 


GS1999full protein - Homo 
sapiens (Human), 297 aa. 


28..317 
1..294 


156/294 (53%) 
210/294 (71%) 


1e-89



Example 24. 

The NOV24 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 24A. 



Table 24A. NOV24 Sequence Analysis 




SEQ ID NO: 115 | 1056 bp


NOV24a, 
CG146531-01 
DNA Sequence 


CATTTTCCAAAGGTGTCACAGGAAGAGCATGGCAGAGCTGGGACTGGGAGCCAGGTCACCATGGCTT 


TCTTCTCCCGACTGAATCTCCAGGAGGGCCTCCAAACCTTCTTTGTTTTGCAATGGATCCCAGTCTA
TATATTTTTAGGTTTGTTCGTCTACCTGCTGTTTACATCCTTGTGGCCGCTACCAGTGCTTTACTTT 
GCCTGGTTGTTCCTGGACTGGAAGACCCCAGAGCGAGGTGGCAGGCGTTCGGCCTGGGTAAGGAACT 
GGTGTGTCTGGACCCACATCAGGGACTATTTCCCCATTCAGATCCTGAAGACAAAGGACCTATCACC
TGAGCACAACTACCTCATGGGGGTTCACCCCCATGGCCTCCTGACCTTTGGCGCCTTCTGCAACTTC 
TGCACTGAGGCCACAGGCTTCTCGAAGACCTTCCCAGGCATCACTCCTCACTTGGCCACGCTGTCCT 
GGTTCTTCAAGATCCCCTTTGTTAGGGAGTACCTCATGGCCAAAGGTGTGTGCTCTGTGAGCCAGCC
AGCCATCAACTATCTGCTGAGCCATGGCACTGGCAACCTCGTGGGCATTGTAGTGGGAGGTGTGGGT 
GAGGCCCTGCAAAGTGTGCCCAACACCACCACCCTCATCCTCCAGAAGCGCAAGGGGTTCGTGCGCA
CAGCCCTCCAGCATGGGGCTCATCTGGTCCCCACCTTCACTTTTGGGGAAACTGAGGTGTATGATCA
GGTGCTGTTCCATAAGGATAGCAGGATGTACAAGTTCCAGAGCTGCTTCCGCCGTATCTTTGGTTTC 
TACTGTTGTGTCTTCTATGGACAAAGCTTCTGTCAAGGCTCCACTGGGCTCCTGCCATACTCCAGGC 
CTATTGTCACTGTTGGGGAGCCTCTGCCACTGCCCCAAATTGAAAAGCCAAGCCAGGAGATGGTGGA 
CAAATACCATGCACTTTATATGGATGCTCTGCACAAACTGTTCGACCAGCATAAGACCCACTATGGC
TGCTCAGAGACCCAAAAGCTGTTTTTCCTGTGAATGAAGGTACTGCATGCC 




ORF Start: ATG at 61 | ORF Stop: TGA at 1036





SEQ ID NO: 116 | 325 aa | MW at 37453.3kD


NOV24a, 
CG146531-01 
Protein Sequence 


MAFFSRLNLQEGLQTFFVLQWIPVYIFLGLFVYLLFTSLWPLPVLYFAWLFLDWKTPERGGRRSAWV
RNWCVWTHIRDYFPIQILKTKDLSPEHNYLMGVHPHGLLTFGAFCNFCTEATGFSKTFPGITPHLAT
LSWFFKIPFVREYLMAKGVCSVSQPAINYLLSHGTGNLVGIVVGGVGEALQSVPNTTTLILQKRKGF
VRTALQHGAHLVPTFTFGETEVYDQVLFHKDSRMYKFQSCFRRIFGFYCCVFYGQSFCQGSTGLLPY
SRPIVTVGEPLPLPQIEKPSQEMVDKYHALYMDALHKLFDQHKTHYGCSETQKLFFL
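The sequence tables report a molecular weight alongside each protein. The source does not say how those figures were computed; the sketch below shows one common approach, summing average residue masses and adding one water for the free termini. The mass table and the function name `molecular_weight` are assumptions introduced here for illustration. Note that a figure such as 37453.3 for a 325-residue chain is on the scale of daltons, whatever unit label the table carries.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886,
    "C": 103.1388, "E": 129.1155, "Q": 128.1307, "G": 57.0519,
    "H": 137.1411, "I": 113.1594, "L": 113.1594, "K": 128.1741,
    "M": 131.1926, "F": 147.1766, "P": 97.1167, "S": 87.0782,
    "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER_MASS = 18.01528  # one water restores the free N- and C-termini

def molecular_weight(protein):
    """Approximate average molecular weight of an unmodified peptide chain."""
    sequence = "".join(protein.split()).upper()
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER_MASS

# Free glycine and the dipeptide Gly-Gly as small sanity checks.
glycine = molecular_weight("G")
diglycine = molecular_weight("GG")
```

A value computed this way can differ slightly from a tabulated one if a different mass table was used or if the signal peptide was excluded.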



Further analysis of the NOV24a protein yielded the following properties shown in 
Table 24B. 



Table 24B. Protein Sequence Properties NOV24a 
PSort analysis:

0.8200 probability located in outside; 0.3880 probability located in microbody 

(peroxisome); 0.1000 probability located in endoplasmic reticulum 

(membrane); 0.1000 probability located in endoplasmic reticulum (lumen) 


SignalP analysis: 


Cleavage site between residues 47 and 48 



A search of the NOV24a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 24C.



Table 24C. Geneseq Results for NOV24a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV24a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


ABB75677 


Breast protein-eukaryotic 
conserved gene 1 
(BSTP-ECG1) protein - 
Homo sapiens, 388 aa. 
[WO200208260-A2, 
31-JAN-2002] 


1..322 
56..385


166/330(50%) 
230/330(69%) 


2e-96 


AAB66170 


Protein of the invention #82 - 
Unidentified, 388 aa. 
[WO200078961-A1, 
28-DEC-2000] 


1..322 
56..385


166/330(50%) 
230/330 (69%) 


2e-96 


AAU29191 


Human PRO polypeptide 
sequence #168 - Homo 
sapiens, 388 aa. 
[WO200168848-A2, 
20-SEP-2001] 


1..322 
56..385


166/330 (50%) 
230/330 (69%) 


2e-96 


AAY99421 


Human PRO1433 (UNQ738)
amino acid sequence SEQ ID 
NO:292 - Homo sapiens, 388 
aa. [WO200012708-A2, 
09-MAR-2000] 


1..322 
56..385 


166/330 (50%) 
230/330 (69%) 


2e-96 


AAY94889 


Human protein clone 
HP02485 - Homo sapiens, 
334 aa. [WO200005367-A2, 
03-FEB-2000] 


13..324 
15..333 


147/321(45%) 
200/321 (61%) 


2e-75 



In a BLAST search of public sequence databases, the NOV24a protein was found to
have homology to the proteins shown in the BLASTP data in Table 24D.
Table 24D. Public BLASTP Results for NOV24a


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV24a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q8TAB1 


BA351K23.5 (Novel protein) 
- Homo sapiens (Human), 
296 aa (fragment). 


34..322 
3..293 


163/291 (56%) 
215/291 (73%) 


1e-97


CAD38961 


Hypothetical protein - Homo 
sapiens (Human), 434 aa 
(fragment). 


1..322 
102..431 


166/330 (50%) 
230/330 (69%) 


6e-96 


Q9DCV3 


0610010B06Rik protein 
(Diacylglycerol 
acyltransferase 2) - Mus 
musculus (Mouse), 388 aa. 


1..322 
56..385 


167/330 (50%) 
230/330 (69%) 


6e-96 


Q96PD7 


Diacylglycerol 
acyltransferase 2 
(Hypothetical 43.8 kDa 
protein) - Homo sapiens 
(Human), 388 aa. 


1..322 
56..385 


166/330 (50%) 
230/330 (69%) 


6e-96 


Q9BYE5 


GS1999full protein - Homo 
sapiens (Human), 297 aa. 


32..322 
1..294 


157/294 (53%) 
211/294(71%) 


1e-91



Example 25.

The NOV25 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 25A. 



Table 25A. NOV25 Sequence Analysis 




SEQ ID NO: 117 | 951 bp


NOV25a, 
CG147274-01 
DNA Sequence 


ATGGGGCTTCGGGCAGGCCCCATCCTGCTTCTGCTGCTGTGGCTGCTGCCAGGGGCCCATTGGGATG 
TGCTGCCTTCAGAATGCGGCCACTCCAAGGAGGCCGGGAGGATTGTGGGAGGCCAAGACACCCAGGA 
AGGACGCTGGCCGTGGCAGGTTGGCCTGTGGTTGACCTCAGTGGGGCATGTATGTGGGGGCTCCCTC 
ATCCACCCACGCTGGGTGCTCACAGCCGCCCACTGCTTCCTGAGGTCTGAGGATCCCGGGCTCTACC 
ATGTTAAAGTCGGAGGGCTGACACCCTCACTTTCAGAGCCCCACTCGGCCTTGGTGGCTGTGAGGAG 
GCTCCTGGTCCACTCCTCATACCATGGGACCACCACCAGCGGGGACATTGCCCTGATGGAGCTGGAC 
TCCCCCTTGCAGGCCTCCCAGTTCAGCCCCATCTGCCTCCCAGGACCCCAGACCCCCCTCGCCATTG 
GGACCGTGTGCTGGGTAAACGGGCTGGGGCCCACATCACATCCAGCCCTGGCGAGTGTCCTTCAGGA 
GGTGGCTGTGCCCCTCCTGGACTCGAACATGTGTGAGCTGATGTACCACCTAGGAGAGCCCAGCCTG 
GCTGGCCAGCGCCTCATCCAGGACGACATGCTCTGTGCTGGCTCTGTCCAGGGCAAGAAAGACTCCT 
GCCAGGGTGACTCCGGGGGGCCGCTGGTCTGCCCCATCAATGATACGTGGATCCAGGCCGGCATTGT 
GAGCTGGGGATTCGGCTGTGCCCGGCCTTTCCGGCCTGGTGTCTACACCCAGGTGCTAAGCTACACA 
GACTGGATTCAGAGAACCCTGGCTGAATCTCACTCAGGCATGTCTGGGGCCCGCCCAGGTGCCCCAG 
GATCCCACTCAGGCACCTCCAGATCCCACCCAGTGCTGCTGCTTGAGCTGTTGACCGTATGCTTGCT 
TGGGTCCCTGTGA 




ORF Start: ATG at 1 | ORF Stop: TGA at 949
SEQ ID NO: 118 | 316 aa | MW at 33574.2kD


NOV25a, 
CG147274-01 
Protein Sequence 


MGLRAGPILLLLLWLLPGAHWDVLPSECGHSKEAGRIVGGQDTQEGRWPWQVGLWLTSVGHVCGGSL
IHPRWVLTAAHCFLRSEDPGLYHVKVGGLTPSLSEPHSALVAVRRLLVHSSYHGTTTSGDIALMELD
SPLQASQFSPICLPGPQTPLAIGTVCWVNGLGPTSHPALASVLQEVAVPLLDSNMCELMYHLGEPSL
AGQRLIQDDMLCAGSVQGKKDSCQGDSGGPLVCPINDTWIQAGIVSWGFGCARPFRPGVYTQVLSYT
DWIQRTLAESHSGMSGARPGAPGSHSGTSRSHPVLLLELLTVCLLGSL
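The protein sequence in Table 25A is the conceptual translation of the listed ORF. The following sketch shows how such a translation can be reproduced with the standard genetic code; the helper name `translate_orf` is hypothetical, and the input is only the first 67-base line of SEQ ID NO: 117 rather than the full sequence.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate_orf(dna, start=1):
    """Translate from a 1-based start position until a stop codon or the end.

    Whitespace inside the sequence is ignored, since extracted sequence
    lines may contain stray spaces. Trailing bases that do not fill a
    codon are dropped.
    """
    sequence = "".join(dna.split()).upper()
    residues = []
    for position in range(start - 1, len(sequence) - 2, 3):
        amino_acid = CODON_TABLE[sequence[position:position + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)

# First line of the NOV25a DNA sequence (ORF start: ATG at 1).
first_line = "ATGGGGCTTCGGGCAGGCCCCATCCTGCTTCTGCTGCTGTGGCTGCTGCCAGGGGCCCATTGGGATG"
n_terminus = translate_orf(first_line)
```

Translating the nucleotide listing this way gives an independent check on the residue-level accuracy of the printed protein sequence.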




Further analysis of the NOV25a protein yielded the following properties shown in 
Table 25B. 




Table 25B. Protein Sequence Properties NOV25a 


PSort analysis: 


0.9190 probability located in plasma membrane; 0.3000 probability located in 
lysosome (membrane); 0.1000 probability located in endoplasmic reticulum 
(membrane); 0.1000 probability located in endoplasmic reticulum (lumen) 


SignalP analysis: 


Cleavage site between residues 22 and 23 
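A SignalP cleavage site "between residues 22 and 23" implies that residues 1 to 22 form the predicted signal peptide and the mature chain begins at residue 23. A minimal sketch of that split follows. The helper name is hypothetical, and the 27-residue N-terminal string is an assumption obtained by translating the start of SEQ ID NO: 117 with the standard genetic code.

```python
def split_at_cleavage(protein, last_signal_residue):
    """Split a protein at a 1-based cleavage position.

    Returns (signal_peptide, mature_chain), where the signal peptide is
    residues 1..last_signal_residue inclusive.
    """
    if not 0 < last_signal_residue < len(protein):
        raise ValueError("cleavage position lies outside the sequence")
    return protein[:last_signal_residue], protein[last_signal_residue:]

# N-terminal residues of NOV25a (assumed from translating SEQ ID NO: 117).
n_terminal_region = "MGLRAGPILLLLLWLLPGAHWDVLPSE"
signal_peptide, mature_start = split_at_cleavage(n_terminal_region, 22)
```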



A search of the NOV25a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 25C.
Table 25C. Geneseq Results for NOV25a


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV25a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAU98887 


Human protease PRTS5 - 
Homo sapiens, 304 aa. 
[WO200238744-A2, 
16-MAY-2002] 


1..316 
1..304 


304/316 (96%) 
304/316 (96%) 


0.0 


AAW77303 


Amino acid sequence of 
SP002LA, a homologue of 
HELA2 - Homo sapiens, 289 
aa. [WO9836054-A1, 
20-AUG-1998] 


28..316
1..289 


285/289 (98%) 
285/289 (98%) 


e-171 


ABG64545 


Human albumin fusion 
protein #1220 - Homo 
sapiens, 290 aa. 
[WO200177137-A1, 
18-OCT-2001] 


5..275 
6..276 


121/275 (44%) 
168/275 (61%) 


1e-63


AAB73945 


Human protease T - Homo 
sapiens, 290 aa. 
[WO200116293-A2, 
08-MAR-2001] 


5..275 
6..276


121/275 (44%) 
168/275 (61%) 


1e-63


AAE03821 


Human gene 4 encoded
secreted protein HWH1H10, 
SEQ ID NO: 67 -Homo 
sapiens, 290 aa. 
[WO200136440-A1, 
25-MAY-2001] 


5..275
6..276


121/275 (44%) 
168/275 (61%) 


1e-63



In a BLAST search of public sequence databases, the NOV25a protein was found to
have homology to the proteins shown in the BLASTP data in Table 25D.



Table 25D. Public BLASTP Results for NOV25a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV25a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q91XC4 


Similar to distal intestinal 
serine protease - Mus 
musculus (Mouse), 310 aa. 


1..316
1..310


202/317 (63%) 
235/317 (73%) 


e-114 
Q9QYZ9


Distal intestinal serine 
protease - Mus musculus 
(Mouse), 310 aa. 


1..316 
1..310 


201/317 (63%) 
233/317 (73%) 



e-113 


Q9BQR3 


Marapsin precursor (EC 
3.4.21.-) - Homo sapiens 
(Human), 290 aa. 


5..275 
6..276 


121/275 (44%) 
168/275 (61%) 


3e-63 


Q8R1A6 


RIKEN cDNA 2010001P08 
gene - Mus musculus 
(Mouse), 331 aa. 


24..305 
41..329


142/293 (48%) 
174/293 (58%) 


5e-62 


Q9DGR3 


Embryonic serine protease-1 -
Xenopus laevis (African 
clawed frog), 317 aa. 


25..304 
29..308 


123/288 (42%) 
165/288 (56%) 


1e-59



Pfam analysis predicts that the NOV25a protein contains the domains shown in
Table 25E.
Table 25E. Domain Analysis of NOV25a


Pfam Domain 


NOV25a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


trypsin 


37..271


109/266 (41%) 
176/266 (66%) 


1.7e-79 
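Match regions throughout these tables are written as `start..end`, using 1-based positions with both ends included. The sketch below turns such a cell into coordinates and extracts the corresponding stretch of a sequence. The helper names are hypothetical, and the parser tolerates stray spaces of the kind seen in extracted cells.

```python
def parse_range(cell):
    """Convert a cell such as '37..271' into a (start, end) tuple of ints."""
    start_text, end_text = cell.replace(" ", "").split("..")
    start, end = int(start_text), int(end_text)
    if start < 1 or end < start:
        raise ValueError(f"invalid residue range: {cell!r}")
    return start, end

def region(protein, cell):
    """Return the residues covered by a 1-based inclusive range cell."""
    start, end = parse_range(cell)
    return protein[start - 1:end]

# The trypsin domain match region reported for NOV25a in Table 25E.
trypsin_start, trypsin_end = parse_range("37..271")
trypsin_length = trypsin_end - trypsin_start + 1
```

Because both ends are inclusive, the length of a region is `end - start + 1`, which gives 235 residues for the trypsin domain match.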



Example 26. 

The NOV26 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 26A.



Table 26A. NOV26 Sequence Analysis 




SEQ ID NO: 119 970 bp 


NOV26a, 
CG147351-01 
DNA Sequence 


CACAAGAACAATATGCAGCTGAGATGAGTAAAGCTATTGCTTTTGAGATCATTCAGAAATACGAGCC 
TATCGAAGAAGTTAGGAAAGCACACCAAATGTCATTAGAAGGTTTTACAAGATACATGGATTCACGT 
GAATGTCTACTGTTTAAAAATGAATGTAGAAAAGTTTATCAAGATATGACTCATCCATTAAATGATT 
ATTTTATTTCATCTTCACATAACACATATTTGGTATCTGATCAATTATTGGGACCAAGTGACCTTTG 
GGGATATGTAAGTGCCCTTGTGAAAGGATGCCGTTGTTTGGAGATTGACTGCTGGGATGGAGCACAA
AATGAACCTGTTGTATATCATGGCTAGACACTCACAAGO^CTTCTGTTTAAAACTGTTATCCAAG 
CTATACACAAGTATGCATTCATGGTGGCTTTAAATTTCCAGACCCCTGGTCTGCCCATGGATCTGCA 
AAATGGGAAATTTTTGGATAATGGTGGTTCTGGATATATTTTGAAACCACATTTCTTAAGAGAGAGT 
AAATCATACTTTAACCC^GTAACATAATiAGAGGGTATGCCAATTACACTTACAATAAGGCTCATCA 
GTGGTATCCAGTTGCCTCTTACTCATTCATCATCTAACAAAGGTGATTCATTAGTAATTATAGAAGT 
TTTTGGTGTTCCAAATGATCAAATGAAGCAGCAGACTCGTGTAATTAAAAAAAATGCTTTTAGTCCA 
AGATGGAATGAAACATTCACATTTATTATTCATGTCCCAGAATTGGCATTGATACGTTTTGTTGTTG 
AAGGTCAAGGTTTAATAGCAGGAAATGAATTTCTTGGGCAATATACTTTGCCACTTCTATGCATGAA 
CAAAGGTTATCGTCGTATTCCTCTGTTTTCCAGAATGGGTGAGAGCCTTGAGCCTGCTTCACTGTTT
GTTTATGTTTGGTACGTCAGATAACAGCTAAG 
ORF Start: ATG at 24 | ORF Stop: TAA at 960





SEQ ID NO: 120 | 312 aa | MW at 35720.0kD


NOV26a, 
CG147351-01 
Protein Sequence 


MSKAIAFEIIQKYEPIEEVRKAHQMSLEGFTRYMDSRECLLFKNECRKVYQDMTHPLNDYFISSSHN
TYLVSDQLLGPSDLWGYVSALVKGCRCLEIDCWDGAQNEPVVYHGYTLTSKLLFKTVIQAIHKYAFM
VALNFQTPGLPMDLQNGKFLDNGGSGYILKPHFLRESKSYFNPSNIKEGMPITLTIRLISGIQLPLT
HSSSNKGDSLVIIEVFGVPNDQMKQQTRVIKKNAFSPRWNETFTFIIHVPELALIRFVVEGQGLIAG
NEFLGQYTLPLLCMNKGYRRIPLFSRMGESLEPASLFVYVWYVR



Further analysis of the NOV26a protein yielded the following properties shown in 
Table 26B. 
Table 26B. Protein Sequence Properties NOV26a


PSort analysis: 


0.5844 probability located in microbody (peroxisome); 0.1814 probability
located in lysosome (lumen); 0.1000 probability located in mitochondrial 
matrix space; 0.0000 probability located in endoplasmic reticulum 
(membrane) 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV26a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 26C.



Table 26C. Geneseq Results for NOV26a


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date]


NOV26a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 


AAU76817 


Human phospholipase C 
16839 polypeptide - Homo 
sapiens, 608 aa. 
[WO200206302-A2, 
24-JAN-2002] 


134..312 
430..608 


179/179 (100%)
179/179 (100%) 


e-101 


ABB90425 


Human polypeptide SEQ ID 
NO 2801 - Homo sapiens, 
179 aa. [WO200190304-A2, 
29-NOV-2001] 


134..312 
1..179 


179/179 (100%) 
179/179 (100%) 


e-101 
AAU87271


Novel central nervous
system protein #181 - Homo 
sapiens, 254 aa. 
[WO200155318-A2, 
02-AUG-2001] 


134..312
76..254


179/179 (100%)
179/179 (100%)


e-101 


AAM95867 


Human reproductive system 
related antigen SEQ ID NO: 
4525 - Homo sapiens, 254 
aa. [WO200155320-A2, 
02-AUG-2001] 


134..312 
76..254 


178/179 (99%) 
178/179 (99%) 


e-100 


AAU22938 


Novel human enzyme 
polypeptide #24 - Homo 
sapiens, 254 aa. 
[WO200155301-A2, 
02-AUG-2001] 


134..312 
76..254


178/179 (99%) 
178/179 (99%) 


e-100 



In a BLAST search of public sequence databases, the NOV26a protein was found to
have homology to the proteins shown in the BLASTP data in Table 26D.
Table 26D. Public BLASTP Results for NOV26a


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV26a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


BAC05152 


CDNA FLJ40406 fis, clone 
TESTI2037534, weakly similar to 
1-PHOSPHATIDYLINOSITOL-4,5-BISPHOSPHATE
PHOSPHODIESTERASE DELTA 1
(EC 3.1.4.11) - Homo sapiens 
(Human), 390 aa. 


134..312 
212..390 


179/179 (100%) 
179/179 (100%) 


e-101 


Q96J70 


Testis-development related NYD-SP27 
- Homo sapiens (Human), 504 aa. 


134..312 
326..504 


178/179 (99%) 
178/179 (99%) 


e-100 


Q95JS0 


Hypothetical 74.4 kDa protein - 
Macaca fascicularis (Crab eating 
macaque) (Cynomolgus monkey), 640 
aa. 


134..312 
462..640 


172/179 (96%) 
177/179 (98%) 


2e-97 


Q95JS1 


Hypothetical 74.6 kDa protein - 
Macaca fascicularis (Crab eating 
macaque) (Cynomolgus monkey), 641 
aa. 


134..312
463..641 


172/179 (96%) 
177/179 (98%) 


2e-97 


AAM95914 


PLC-zeta - Mus musculus (Mouse), 
647 aa. 


134..312 
467..646 


135/181 (74%) 
158/181 (86%) 


7e-73 
Pfam analysis predicts that the NOV26a protein contains the domains shown in
Table 26E.



Table 26E. Domain Analysis of NOV26a 


Pfam Domain 


NOV26a Match Region 


Identities/ 
Similarities 
for the Matched 
Region 


Expect Value 


PI-PLC-X 


52..133


45/83 (54%) 
66/83 (80%) 


4.3e-36 


PI-PLC-Y 


134..169


25/43 (58%) 
33/43 (77%) 


2.9e-17 


C2 


188..276


33/97(34%) 
73/97(75%) 


4.9e-20 



Example 27. 

The NOV27 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 27A. 



Table 27A. NOV27 Sequence Analysis 




SEQ ID NO: 121 | 3136 bp


NOV27a, 
CG147419-01 
DNA Sequence 


AGGGAGTCGTGTCGGCGCCACCCCGGCCCCCGAGCCCGCAGATTGCCCACCG7^AGCTCGTGTGTGCA 


CCCCCGATCCCGCCAGCCACTCGCCCCTGGCCTCGCGGGCCGTGTCTCCGGCATCATGTGTGGTATA 


TTTGCTTACTTAAACTACCATGTTCCTCGAACGAGACGAGAAATCCTGGAGACCCTAATCAAAGGCC 
TTCAGAGACTGGAGTACAGAGGATATGATTCTGCTGGTGTGGGATTTGATGGAGGCAATGATAAAGA
TTGGGAAGCCAATGCCTGCAAAACCCAGCTTATTAAGAAGAAAGGAAAAGTTAAGGCACTGGATGAA
GAAGTTCACAAGCAACAAGATATGGATTTGGATATAGAATTTGATGTACACCTTGGAATAGCTCATA 
CCCGTTGGGCAACACATGGAGAACCCAGTCCTGTCAATAGCCACCCCCAGCGCTCTGATAAAAATAA
TGAATTTATCGTTATTCACAATGGCATCATCACCAACTACAAAGACTTGAAAAAGTTTTTGGAAAGC
AAAGGCTATGACTTCGAATCTGAAACAGACACAGAGACAATTGCCAAGCTCGTTAAGTATATGTATG 
ACAATCGGGAAAGTCAAGATACCAGCTTTACTACCTTGGTGGAGAGAGTTATCCAACAATTGGAAGG 
TGCTTTTGCACTTGTGTTTAAAAGTGTTCATTTTCCCGGGCAAGCAGTTGGCACAAGGCGAGGTAGC 
CCTCTGTTGATTGGTGTACGGAGTGAACATAAACTTTCTACTGATCACATTCCTATACTCTACAGAA 
CAGCTAGGACTCAGATTGGATCAAAATTCACACGGTGGGGATCACAGGGAGAAAGAGGCAAAGACAA 
GAAAGGAAGCTGC^TCTCTCTCGTGTGGACAGCTVC^CCTGCCTTTTCCCGGTGGAAGAAAAAGCA 
GTGGAGTATTACTTTGCTTCTGATGCAAGTGCTGTCATAGAACACACCAATCGCGTCATCTTTCTGG 
AAGATGATGATGTTGCAGCAGTAGTGGATGGACGTCTTTCTATCCATCGAATTAAACGAACTGCAGG 
AGATCACCCCGGACGAGCTGTGCAAACACTCCAGATGGAACTCCAGCAGATCATGAAGGGCAACTTC 
AGTTCATTTATGCAGAAGGAAATATTTGAGCAGCCAGAGTCTGTCGTGAACACAATGAGAGGAAGAG 
TCAACTTTGATGACTATACTGTGAATTTGGGTGGTTTGAAGGlATCACATAAAGGAGATCCAGAGATG 
CCGGCGTTTGATTCTTATTGCTTGTGGAACAAGTTACCATGCTGGTGTAGCAACACGTCAAGTTCTT 
GAGGAGCTGACTGAGTTGCCTGTGATGGTGGAACTAGCAAGTGACTTCCTGGACAGAAACACACCAG 
TCTTTCGAGATGATGTOTGCTTTTTCCTTAGTCAATCAGGTGAGACAGCAGATACTTTGATGGGTCT 
TCGTTACTGTAAGGAGAGAGGAGCTTTAACTGTGGGGATCACAAACACAGTTGGCAGTTCCATATCA 
CGGGAGACAGATTGTGGAGTTCATATTAATGCTGGTCCTGAGATTGGTGTGGCCAGTACAAAGGCTT 
ATAC^GCCAGTTTGTATCCCTTGTGATGTTTGCCCTTATGATGTGTGATGATCGGATCTCCATGCA 
AGAAAGACGCAAAGAGATCATGCTTGGATTGAAACGGCTGCCTGATTTGATTAAGGAAGTACTGAGC 
ATGGATGACGAAATTCAGAAACTAGCAACAGAACTTTATCATCAGAAGTCAGTTCTGATAATGGGAC 
GAGGC TATCATTATGCT AC TTGTCTTGAAGGGGCAC TGAAAATCAAAGAAATTAC TTATATGCACTC 
TGAAGGCATCCTTGCTGGTGAATTGAAACATGGCCCTCTGGCTTTGGTGGATAAATTGATGCCTGTG 
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a rnp a a tp 7i tp a tg a g ag a tp a p a p tt a tp.pp a a r$rnHv a f£ A'A^GT^r^PiP AT2PA a ftT^^T'T^p* tP' 
pp.p T^cznaczcaaoc tgtggtaa tttgtg at a agg a gg at aptgagappattaagaapapa a a aag a ap 

f2 ATP A Af2flTf2PPPP APTP AGTflG APTGPTTGP AftlPflP ATTPTP AGPGTG ATPPPTTT AP APTTGPTfS 
GPTTTPPAPPTTGPTGTGPTnAGAGGPTATGATGTTGATTTPPPAPGGAATPTTGPPAAATPTGTrjA 
PTGTAGAGTGAGGA ATATPTATAPAAAATGTAPGAAAPTGTATGATTAAGPAACAPAAGAPAPPTTT 


TGTATTTA A AAPPTTGATTTA AAATATPAPPPPTTGAAGPPTTTTTTTAGTAAATPPTTATTTATAT 


ATP A^TTATA ATTATTPPAPTPA ATATGTGAT^TTTGTGA AGTTAPPTPTTAPATTTTPPPAfSTA AT 


TTGTGGAGGAPTTTGAATAATGGAATPTATATTGGAATPTGTATPAGAAAGATTCTAGPTATTATTT 


TPTTTA A AG A ATGPTGGGTGTTGP ATTTPTGGAPPPTPP APTTP A ATPTGAGA AGAPA AT ATGTTTP 

X \_> X X InnAu/viJLOV^ X uUVJ X U X X uUnl X lu X X V-» \— rlv^ x X Lnni x unonnvj/iunn inly XXX v— 


T AAAAATTGGTACTTGTTTC ACCATAC T TC ATTC AGACC AGTG AAAGAG T AGTGC ATTTAATTGGAG 


TATCTAAAGCCAGTGGC AGTGTATGC TCATACT TGGAC AGTTAGGGAAGGGTTTGCCAAGTTT TAAG 


AGAAGATG TCATTTATTT TG AAATTTCTCT TGTAAAAC TTAAAAC 


TGAAAAATTTTATTGGTAGGATTTATATCTAAGTTTGGTTAGCCTTAGTTTCTCAGACTTGTTGTCT 


ATOATCTCTAGGTGGAAGAAATTTAGGAAGCGAAATAOT 


TTAACATATTTGCACAATTTTATAGCACAAACTTTAAATTCAAGCTGCTTTGGACAACTGACAATAT 


GATTTTAAATTTGAAGATGGGATGTGTACATGTTGGGTATCCTACTACTTTGTGTTTTCATCTCCTA 


AAAGTGTTxyrTTATTOCCTTGTATCTGTAGTCTTraATTTTTTAAATGACTGCTGAATGACATATTT 


T ATCTTGTTC TTT AAAATC ACAACACAG AGC TGCT ATTAAATTAATAT TGATAT 




ORF Start: ATG at 123 | ORF Stop: TGA at 2220





SEQ ID NO: 122 | 699 aa | MW at 78793.6 kD


NOV27a, 
CG147419-01 
Protein Sequence 


MCG I FAYLNYHVPRTRRE I LETL I KGLQRLE YRGYDS AGVGFDGGlTOKDVnSANACCT 

ALDEEVHKQQDMDLDIEFDVHLG:^ 

FLESKGYDFESETDTETIAKLVKxTi^^ 

RRGSPLLIGVRSEIiKLSTDHIPILYRTAR^ 

EEKAVEYYFAiSDASAVIl^TNRVIF 

kgotssfmqkeifeqpesvvntiv^^ 

RQVLEELTELPVMVELASDFLDROTPVFRD^^ 

SSI SRETDCGVHII^GPEIGVASTKAYTSQFVSLVMFALMMCDDRI SMQERRKEIMLGLKRLPDLIK 
EVLSMDDEIQKlxATELYHQKSVL^ 

LMPVIMIIMRDHTYAKCQNAiQQWARQGRPWICTKro 
LQLL AFHIiAVLRG YDVDF PRNLAKS VTVE 
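
The ORF coordinates and protein length reported in Table 27A can be cross-checked arithmetically. The short Python sketch below assumes (this is not stated in the text) that positions are 1-based and that the reported stop position is the first base of the stop codon. The function name `expected_stop` is hypothetical and used only for illustration.

```python
def expected_stop(orf_start: int, protein_length: int) -> int:
    """Return the 1-based position of the first base of the stop codon.

    Assumption: the ORF starts at `orf_start` (first base of the ATG) and
    encodes `protein_length` residues in one uninterrupted reading frame.
    """
    return orf_start + 3 * protein_length


# Values as reported for NOV27a: ATG at 123, 699 aa, TGA at 2220.
print(expected_stop(123, 699))  # 2220
```

Under these assumptions the computed value agrees with the reported stop position, which suggests the coordinates and the protein length in the table are mutually consistent.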



Further analysis of the NOV27a protein yielded the following properties shown in 
Table 27B. 
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Table 27B. Protein Sequence Properties NOV27a 


PSort analysis: 


0.4902 probability located in mitochondrial inner membrane; 0.4400 
probability located in plasma membrane; 0.3000 probability located in 
microbody (peroxisome); 0.2000 probability located in endoplasmic reticulum 
(membrane) 


SignalP analysis: 


No Known Signal Sequence Predicted 
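
The PSort line in Table 27B lists several candidate localizations with probabilities. As an illustration only, the following Python sketch parses text of that shape and picks the highest-scoring entry. The helper names `parse_psort` and `top_location` are hypothetical, and the regular expression assumes the "<probability> probability located in <compartment>" wording seen above.

```python
import re
from typing import List, Tuple

PSORT_TEXT = (
    "0.4902 probability located in mitochondrial inner membrane; "
    "0.4400 probability located in plasma membrane; "
    "0.3000 probability located in microbody (peroxisome); "
    "0.2000 probability located in endoplasmic reticulum (membrane)"
)

# One entry looks like: "<number> probability located in <compartment>"
ENTRY = re.compile(r"(\d+\.\d+)\s+probability\s+located\s+in\s+([^;]+)")


def parse_psort(text: str) -> List[Tuple[str, float]]:
    """Return (compartment, probability) pairs in the order they appear."""
    return [(place.strip(), float(prob)) for prob, place in ENTRY.findall(text)]


def top_location(text: str) -> Tuple[str, float]:
    """Return the compartment with the highest reported probability."""
    return max(parse_psort(text), key=lambda pair: pair[1])


print(top_location(PSORT_TEXT))
```

The margin between the top two entries here is small (0.4902 versus 0.4400), so the top-ranked compartment should be read as a prediction rather than an established localization.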



A search of the NOV27a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 27C.
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Table 27C. Geneseq Results for NOV27a 


Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV27a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
ABB05747 | Human GFAT1L protein SEQ ID NO:1 - Homo sapiens, 699 aa. [WO200196574-A1, 20-DEC-2001] | 1..699; 1..699 | 698/699 (99%); 698/699 (99%) | 0.0
AAY90260 | Human GFAT protein sequence - Homo sapiens, 681 aa. [WO200037617-A1, 29-JUN-2000] | 1..699; 1..681 | 681/699 (97%); 681/699 (97%) | 0.0
AAR43348 | Human GFAT - Homo sapiens, 681 aa. [WO9321330-A, 28-OCT-1993] | 1..699; 1..681 | 680/699 (97%); 680/699 (97%) | 0.0
AAY90261 | Human GFAT II protein sequence - Homo sapiens, 682 aa. [WO200037617-A1, 29-JUN-2000] | 1..699; 1..682 | 541/701 (77%); 618/701 (87%) | 0.0
AAW37772 | Human glutamine:fructose-6-phosphate amidotransferase TGC028-4 - Homo sapiens, 682 aa. [EP824149-A2, 18-FEB-1998] | 1..699; 1..682 | 541/701 (77%); 618/701 (87%) | 0.0
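
Each identity entry in Table 27C has the form "matches/aligned length (percent)". The Python sketch below shows one way such a cell could be parsed and the percentage recomputed. It assumes integer truncation, which agrees with the identity figures listed above; whether the originating search tool truncates or rounds is an assumption here, and the similarity figures may follow a different convention. The function names are hypothetical.

```python
import re
from typing import Tuple


def parse_fraction(cell: str) -> Tuple[int, int, int]:
    """Split a cell such as '698/699 (99%)' into (matches, length, percent)."""
    m = re.fullmatch(r"\s*(\d+)/(\d+)\s*\((\d+)%\)\s*", cell)
    if m is None:
        raise ValueError(f"unrecognized cell: {cell!r}")
    return int(m[1]), int(m[2]), int(m[3])


def truncated_percent(matches: int, length: int) -> int:
    """Percentage with the fractional part dropped (assumed convention)."""
    return matches * 100 // length


matches, length, reported = parse_fraction("698/699 (99%)")
print(truncated_percent(matches, length), reported)
```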



In a BLAST search of public sequence databases, the NOV27a protein was found to
have homology to the proteins shown in the BLASTP data in Table 27D.



Table 27D. Public BLASTP Results for NOV27a 


Protein Accession Number | Protein/Organism/Length | NOV27a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
Q99MJ4 | Glutamine:fructose-6-phosphate amidotransferase 1 muscle isoform GFAT1M - Mus musculus (Mouse), 697 aa. | 1..699; 1..697 | 688/699 (98%); 690/699 (98%) | 0.0
A45055 | glutamine--fructose-6-phosphate transaminase (isomerizing) (EC 2.6.1.16) - human, 681 aa. | 1..699; 1..681 | 681/699 (97%); 681/699 (97%) | 0.0
Q06210 | Glucosamine--fructose-6-phosphate aminotransferase [isomerizing] 1 (EC 2.6.1.16) (Hexosephosphate aminotransferase 1) (D-fructose-6-phosphate amidotransferase 1) (GFAT 1) (GFAT1) - Homo sapiens (Human), 680 aa. | 2..699; 1..680 | 680/698 (97%); 680/698 (97%) | 0.0
BAB31882 | Gfpt1 protein - Mus musculus (Mouse), 681 aa. | 1..699; 1..681 | 674/699 (96%); 676/699 (96%) | 0.0
P47856 | Glucosamine--fructose-6-phosphate aminotransferase [isomerizing] 1 (EC 2.6.1.16) (Hexosephosphate aminotransferase 1) (D-fructose-6-phosphate amidotransferase 1) (GFAT 1) (GFAT1) - Mus musculus (Mouse), 680 aa. | 2..699; 1..680 | 673/698 (96%); 675/698 (96%) | 0.0



PFam analysis predicts that the NOV27a protein contains the domains shown in the 
Table 27E. 




Table 27E. Domain Analysis of NOV27a

Pfam Domain | NOV27a Match Region | Identities/Similarities for the Matched Region | Expect Value
GATase_2 | 2..210 | 91/219 (42%); 202/219 (92%) | 4.6e-127
SIS | 378..512 | 52/156 (33%); 118/156 (76%) | 2.2e-48
SIS | 549..685 | 52/156 (33%); 124/156 (79%) | 3.3e-46


Example 28. 

The NOV28 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 28A.
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Table 28A. NOV28 Sequence Analysis 




SEQ ID NO: 123 | 2521 bp


NOV28a, 
CG148102-01 
DNA Sequence


ACTCTGCCCGACTCAGGGCTCCAGCGTGACATGGCTGAAGCGCACCAGGCCGTGGGCTTCCGACCCT 


CGC TGACC TC GGACGGGGC TGAAGTGGAACTC AGTGCCCC TGTGC TGC AGG AG ATC T ACC TCTCTGG 
CCTGCGCTCCTGGAAAAGGCATCTCTCACGTTTCTGGGTGCAGAATGACTTTCTCACCGGTGTGTTT 
CCTGCCAGCCCCCTCAGTTGGCTTTTCCTCTTCAGTGCCATCCAGCTTGCCTGGTTCCTCCAGCTGG 
ATCCTTCCTTAGGACTGATGGAGAAGATCAAAGAGTTGCTGCGGGGGGTCCTGGCAGCCGCGCTGTT 
TGCCTCGTGTTTGTGGGGAGCCCTGATCTTCACACTGCACGTGGCCCTGAGGCTGCTTCTGTCCTAC 
CACGGCTGGCTTCTTGAGCCCCACGGAGCCATGTCCTCCCCCACCAAGACCTGGCTGGCCCTGGTCC 
GCATCTTCTCTGGCCGCCACCCGATGCTGTTCAGTTACCAGCGCTCCCTGCCACGCCAGCCCGTGCC 
CTCTGTGCAGGACACCGTGCGCAAGTACCTGGAGTCGGTCCGGCCCATCCTCTCCGACGAGGACTTC 
GACTGGACCGCGGTCCTGGCGCAGGAATTCCTGAGGCTGCAGGCGTCGCTGCTGCAGTGGTACCTGC 
GGCTCAAGTCCTGGTGGGCGTCCAATTATGTGAGTGACTGGTGGGAGGAATTTGTGTACCTGCGCTC 
CCGAAATCCGC TGATGGTGAAC AGCAAC TATTAC ATGATGGAC TTCCTGTATGTC AC ACCCAC GCCT 
CTGCAGGCAGCTCGCGCTGGGAATGCCGTCCATGCCCTCCTCCTGTACCGCCACCGCCTGAACCGCC 
AGGAGATACCCCCGGTGAGACTGATGGGAATGCGCCCCTTATGCTCTGCCCAGTACGAGAAGATCTT 
CAACACCACGCGGATTCCAGGGGTCCAAAAAGGTGAGACCATCCGCCACCTCCATGACAGCCAACAC 
GTGGCTGTCTTCCACCGGGGCCGATTCTTCCGCATGGGGACCCACTCCCGAAACAGCCTGCTTTCCC 
CGAGAGCCCTGGAGCAGCAGTTTCAGAGAATCCTGGATGATCCCTCACCGGCCTGCCCCCACGAGGA 
ACATCTGGCAGCTCTGACAGCTGCTCCCAGGGGCACGTGGGCCCAGGTGCGGACATCCCTGAAGACC 
CAGGCAGCGGAGGCCCTGGAGGCGGTGGAAGGGGCCGCTTTCTTTGTGTCACTGGATGCTGAGCCCG 
C GGGGC T CAC CAGGG AGGAC C C GGCAGCG TC G TTGGATG C C TACGC CC ATG C TC TGC TG GC CGGC C G 
GGGCCATGATCGGTGG TTTGAC AAATCCTTC ACCCTAATCGTC TTCTC TAACGGGAAGCTGGGCC TC 
t\ r^r*r*nyc*ri7ir*r*TLf^TT\f^r t n>r!i{^r t r*c > c* TiCVCCCCC a TCtVC 'hnn'hC'AC^. ^ri^r2riCZTi.{^ , ^ r T t C'hC r VC r VCiriC r P7^.C'h C 

AATuLTU J.\,A«L 1 VjvjVjL. J. Al- 1 L.AAUAksAV_ki<jV-L.Al- 1 \^AAV3*jrkJol_A^V-V_va^/4LA^^ nv-A^ X AV-V-V-k-A 

GCCAAGATCTTGTCTGAAAATGTCGACTGCCATGTCGTTCCATTCTCCCTATTTGGCAAGAGCTTCA 
TCCGACGCTGCCACCTCTCTTCAGACAGCTTCATCCAGATCGCCTTGCAACTGGCCCACTTCCGGGA 
CCCACAGTGCCTCGCCCTGTTCCGCGTGGCAGTGGACAAGCACCAGGCTCTGCTGAAGGCAGCCATG 
AGCGGGCAGGGAGTTGACCGCCACCTGTTTGCGCTGTACATCGTGTCCCGATTCCTCCACCTGCAGT 
CGCCCTTCCTGACCCAGGTCCATTCGGAGCAGTGGCAGCTGTCCACCAGCCAGATCCCTGTTCAGCA 
AATGCATCTGTTTGACGTCCACAATTACCCGGACTATGTTTCCTCAGGCGGTGGATTCGGGCCTGCT 
GATGACCATGGTTATGGTGTTTCTTATATCTTCATGGGGGATGGCATGATCACCTTCCACATCTCCA 
GCAaaAaATCaAGCACAAaaACGGATTCCCACAGGCTGGGGCAGCACATTGAGGACGCACTGCTGGA 
TGTGGCCTCCCTGTTCCAGGCGGGACAGCATTTTAAGCGCCGGTTCAGAGGGTCAGGGAAGGAGAAC 
TCCAGGCACAGGTGTGGATTTCTCTCCCGCCAGACTGGGGCCTCCAAGGCCTCAATGACATCCACCG 
ACTTCTGACTCCTTCCAGCAGGCAGCTGGCCTCTCCAAGGAATAAGGGTGAAATTGCCACAGCTGGC 


TGACACAGGACAGGGGCAACTGGTTTGGCAACCCCACATCCAGGCCAATAAAGATGTGTGAGCTGGG 


TGTGTGGTGTCTGCTATGCTCTTGGGCAGGGCAGGGGTAGAAGAGGTAAGGACCAGGGTGGAGGAGG 


ACAGAAGCTCCCATCCATTCCCAGGCCCAGCCAGGGATTCCC 




ORF Start: ATG at 31 | ORF Stop: TGA at 2284





SEQ ID NO: 124 | 751 aa | MW at 84918.2 kD


NOV28a, 
CG148102-01 
Protein Sequence 


MAEAHQAVGFRPSLTSDGA1WELSAPVLQEIYL 

FS AIQLAWFLQLDPS LGLMEK I KELLRG VLAAALFASCLWGALI FTLHVALRLLL S YHGWLLEPHGA 

MSSPTKxTOiALTOIFSGRHPl^FSYQRSLPRQPVPSVQDl^^ 

LRLQASLLQWYLRLKSWWASISTYVSDWWEEFV^ 

HALLLYRHI^NRQEIPPVRLMGMRPLCSAQYEKIFNTTRIPGVQKGETIRHLHDSQHVAVFHRGRFF 
RMGTHSRNSLLSPRALEQQFQRILDDPSPACPHEEHlxAALTAAPRGTWAQVRTSLKTQAAEAL 
GAAFFVS LDAE PAGLTREDPAASLDAYAHALL AGRGHDRWFDKSFTL I VF SNGKLGL S VEHSWADC P 
ISGHMWEFTl^TECFQLGYSTIXSHCK^^ 

HWPFSLFGKSFIRRCHLSSDSFIQIALQLAHFRDPQCLALFRVAVDKHQAI.LKAAMS 

AL YIVSRFLHLQSPFLTQWSEQWQLSTSQI PVQQMHLFDVHNYPDYVS SGGGFGPADDHGYGVSYI 

FMGDGMITFHISSKKSSTKTDSHRLGQHIEDALLDVASLFQAGQHFKRRFRGSGKENSRHRCGFLSR 

QTGASKASMTSTDF 



SEQ ID NO: 125 | 2748 bp




NOV28b, 
CG148102-02 
DNA Sequence 



CGAG AG AC AGGAATCGGGGTTTC TGGGTGAC GGTGXTC' 



CGACCCGGTGGTGGACTCCTTGCACTGGGATTGGACATATGCAAGCGGGAGATTTGGGGCCGGCGCT 



CAAAATCGGGGGGCGGGGGTGGACTCGGGTTTGGACCCCAGGATCCGATCAGCGGACCCTTGATTCA 



ACGTGGGCTCCAGCGTGACA TGGCTGAAGCGCACCAGGCCGTGGGCTTCCGACCCTCGCTGACCTCG 



j'CAGGACTCC: 



CCCGT 



GACGGGGCTGAAGTGGAACTCAGTGCCCCTGTGCTGCAGGAGATCTACCTCTCTGGCCTGCGCTCCT 
GGAAAAGGCATCTCTCACGTTTCTGGAATGACTTTCTCACCGGTGTGTTTCCTGCCAGCCCCCTCAG 
TTGGCTTTTCCTCTTCAGTGCCATCCAGCTTGCCTGGTTCCTCCAGCTGGATCCTTCCTTAGGACTG 
ATGGAGAAGATCAAAGAGTTGC TGCC TGACTGGGGTGGAC AACACC ACGGGCTCCGGGGGGTC CTGG 
CAGCCGCGCTGTTTGCCTCGTGTTTGTGGGGAGCCCTGATCTTCACACTGCACGTGGCCCTGAGGCT 
GCTTCTGTCCTACCACGGCTGGCTTCTTGAGCCCCACGGAGCCATGTCCTCCCCCACCAAGACCTGG 
CTGGCCCTGGTCCGCATCTTCTCTGGCCGCCACCCGATGCTGTTCAGTTACCAGCGCTCCCTGCCAC 
GCCAGCCCGTGCCCTCTGTGCAGGACACCGTGCGCAAGTACCTGGAGTCGGTCCGGCCCATCCTCTC 
CGACGAGGACTTCGACTGGACCGCGGTCCTGGCGCAGGAATTCCTGAGGCTGCAGGCGTCACTGCTG 
CAGTGGTACCTGCGGCTCAAGTCCTGGTGGGCGTCCAATTATGTCAGTGACTGGTGGGAGGAATTTG 
TGTACCTGCGCTCCCGAAATCCGCTGATGGTGAACAGCAACTATTACATGATGGACTTCCTGTATGT 
CACACCCACGCCTCTGCAGGCAGCTCGCGCTGGGAATGCCGTCCATGCCCTCCTCCTGTACCGCCAC 
CGCC TGAACCGC C AGGAGATACC CCCGACTTTGCTGATGGGAATGCGCCCC T TATGCTC TGCCC AGT 
ACGAGAAGATCTTCAACACCACGCGGATTCCAGGGGTCCAAAAAGACTACATCCGCCACCTCCATGA 
CAGCCAACACGTGGCTGTCTTCCACCGGGGCCGATTCTTCCGCATGGGGACCCACTCCCGAAACAGC 
CTGC TTTCC CCGAGAGCCC TGGAGCAGCAGTTTC AGAGAATCCTGGATGATCCCTC ACCGGCC TGCC 
CCCACGAGGAACATCTGGCAGCTCTGACAGCTGCTCCCAGGGGCACGTGGGCCCAGGTGCGGACATC 
CCTGAAGACCCAGGCAGCGGAGGCCCTGGAGGCGGTGGAAGGGGCCGCTTTCTTTGTGTCACTGGAT 
GCTGAGCCCGCGGGGCTCACCAGGGAGGACCCGGCAGCGTCGTTGGATGCCTACGCCCATGCTCTGC 
TGGCTGGCCGGGGCCATGATCGCTGGTTTGACAAATCCTTCACCCTAATCGTCTTCTCTAACGGGAA 
GCTGGGCCTCAGCGTGGAGCACTCCTGGGCCGACTGCCCCATCTCAGGACACATGTGGGAGTTCACT 
C TGG CTACAGAATGCTTTCAGC TGGGC TACTC AAC AGATGGCCACTGCAAGGGGC ACCCGGACCCCA 
CACTACCCCAGCCCCAGCGGCTGCAATGGGACCTTCCAGACCAGATCCACTCCTCCATCTCTCTAGC 
CCTGAGGGGAGCCAAGATCTTGTCTGAAAATGTCGACTGCCATGTCGTTCCATTCTCCCTATTTGGC 
AAGAGCTTCATCCGACGCTGCCACCTCTCTTCAGACAGCTTCATCCAGATCGCCTTGCAACTGGCCC 
ACTTCCGGGACAGGGGTCAATTCTGCCTGACTTATGAGTCGGCCATGACTCGCTTATTCCTGGAAGG 
CCGGACGGAGACGGTGCGGTCTTGCACGAGGGAGGCCTGCAACTTTGTCAGGGCCATGGAGGACAAA 
GAGAAGACGGACCCACAGTGCCTCGCCCTGTTCCGCGTGGCAGTGGACAAGCACCAGGCTCTGCTGA 
AGGCAGCCATGAGCGGGCAGGGAGTTGACCGCCACCTGTTTGCGCTGTACATCGTGTCCCGATTCCT 
CCAC CTGCAGTCGCCCTTCC TGACCC AGGTCC ATTCGGAGCAGTGGCAGC TGTCC ACC AGCC AGATC 
CCTGTTCAGCAAATGCATCTGT TTGACGTCC ACAATT ACCCGGAC TATGTTTC CTC AGGCGGTGGAT 
TCGGGCCTGCTGATGACCATGGTTATGGTGTTTCTTATATCTTCATGGGGGATGGCATGATCACCTT 
CC AC ATCTC CAGCAAAAAATC AAGC AC AAAAACGGATTC CCAC AGGCTGGGGC AGCAC ATTGAGGAC 
GCACTGCTGGATGTGGCCTCCCTGTTCCAGGCGGGACAGCATTTTAAGCGCCGGTTCAGAGGGTCAG 
GGAAGGAGAACTCCAGGCACAGGTGTGGATTTCTCTCCCGCCAGACTGGGGCCTCCAAGGCCTCAAT 
GAC ATCCACCGACTTCTG AC TC CTTCCAGC AGGC AGCTGGC CTC TCCAAGGAATAAGGGTGAAATTG 



CC ACAGCTGGC TGACAC AGGAC AGGGGCAACTGGTTTGGC AAC C CC AC ATCC AGGCAAATAAAGATG 



ORF Start: ATG at 221 



ORF Stop: TGA at 2630 





SEQ ID NO: 126 | 803 aa | MW at 90987.8 kD


NOV28b, 
CG148102-02 
Protein Sequence 


MAEAHQAVGFRPSLTSTCAEVELSAPVLQEIYLSGL^ 

AI QLAWFLQLDPSLGLMEKIKELL PDWGGQHHGLRGVLAAALFASCLWGALIFTLHVALRLLLS YHG 
WLLEPHGAMSSPTKTWI^VRIFSGRHPMLFSYQRSLPRQPVPSVQDTVRKYLESVRPILSDEDFDW 
TAVLAQEFLRLQASLLQWYLRLKSVWASNYVSDWTO 

AARAGNAVHALLLYRHRLNRQE IPPTLLMGMR PLCSAQYEKIFNTTRI PGVQKDYIRHLHDSQHVAV 
FHRGRFFRMGTHSRNSLLSPRALEQQFQRILDDPSPACPHEEHLAALTAAPRGTWAQVRTSLKTQAA 
EALEAVEGAAFFVSLDAEPAGLTREDPAASLDAYAHALLAGRGHDRWFDKSFTTjIVFSNGKLGLSVE 
HSWADCPISGHMWEFTLATECFQLGYSTDGHCKGHPDPTLPQPQRLQWDLPDQIHSSISLALRGAKI 
LSENVDCHVVPFSLFGKSFIRRCHLSSDSFIQIALQLAHFR^ 

SCTREACNFVRAMEDKEKTDPQCLALFRVAVDKHQALLKAAMSGQGVDRHLFALYIVSRFLHLQSPF 
LTQVHSEQWQLSTSQ IPVQQMHLFDVHNYPDYVSSGGGFGPADDHGYGVSYIFMGDGMI TFHISSKK 
SSTKTDSHRLGQHIEDALLDVASLFQAGQHFKRRFRGSGKENSRHRCGFLSRQTGASKASMTSTDF 



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 28B. 
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Table 28B. Comparison of NOV28a against NOV28b. 


Protein Sequence | NOV28a Residues/Match Residues | Identities/Similarities for the Matched Region
NOV28b | 1..751; 1..803 | 717/806 (88%); 719/806 (88%)



Further analysis of the NOV28a protein yielded the following properties shown in
Table 28C.



Table 28C. Protein Sequence Properties NOV28a 


PSort analysis: 


0.7900 probability located in plasma membrane; 0.6400 probability located in 
microbody (peroxisome); 0.3000 probability located in Golgi body; 0.2000 
probability located in endoplasmic reticulum (membrane) 


SignalP analysis: 


Cleavage site between residues 5 and 6 
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A search of the NOV28a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 28D. 




Table 28D. Geneseq Results for NOV28a 


Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV28a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
AAY79220 | Human transferase TRNSFS-12 - Homo sapiens, 803 aa. [WO200014251-A2, 16-MAR-2000] | 1..751; 1..803 | 740/806 (91%); 742/806 (91%) | 0.0
AAE10322 | Human carnitine acyltransferase, 26886 - Homo sapiens, 803 aa. [WO200166759-A2, 13-SEP-2001] | 1..751; 1..803 | 739/806 (91%); 742/806 (91%) | 0.0
AAW14438 | Type I carnitine palmitoyl transferase-like protein - Homo sapiens, 772 aa. [JP09009969-A, 14-JAN-1997] | 1..711; 1..766 | 375/770 (48%); 495/770 (63%) | 0.0
ABG04960 | Novel human diagnostic protein #4951 - Homo sapiens, 521 aa. [WO200175067-A2, 11-OCT-2001] | 224..571; 92..471 | 337/381 (88%); 339/381 (88%) | 0.0
ABB67527 | Drosophila melanogaster polypeptide SEQ ID NO 29373 - Drosophila melanogaster, 780 aa. [WO200171042-A2, 27-SEP-2001] | 1..717; 1..765 | 315/775 (40%); 447/775 (57%) |



In a BLAST search of public sequence databases, the NOV28a protein was found to
have homology to the proteins shown in the BLASTP data in Table 28E. 



Table 28E. Public BLASTP Results for NOV28a

Protein Accession Number | Protein/Organism/Length | NOV28a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
Q8TCG5 | Carnitine palmitoyltransferase IC - Homo sapiens (Human), 803 aa. | 1..751; 1..803 | 740/806 (91%); 742/806 (91%) | 0.0
CAC88591 | Sequence 1 from Patent WO0166759 - Homo sapiens (Human), 803 aa. | 1..751; 1..803 | 739/806 (91%); 742/806 (91%) | 0.0
AAH29104 | Similar to carnitine palmitoyltransferase IC - Homo sapiens (Human), 792 aa. | 1..751; 1..792 | 729/806 (90%); 731/806 (90%) | 0.0
P32198 | Carnitine O-palmitoyltransferase I, mitochondrial liver isoform (EC 2.3.1.21) (CPT I) (CPTI-L) - Rattus norvegicus (Rat), 773 aa. | 1..710; 1..765 | 394/768 (51%); 524/768 (67%) | 0.0
Q9BWK0 | Similar to carnitine palmitoyltransferase I, liver - Homo sapiens (Human), 756 aa. | 1..690; 1..745 | 381/748 (50%); 510/748 (67%) | 0.0





PFam analysis predicts that the NOV28a protein contains the domains shown in the 
Table 28F. 



Table 28F. Domain Analysis of NOV28a 


Pfam Domain | NOV28a Match Region | Identities/Similarities for the Matched Region | Expect Value
Carn_acyltransf | 162..708 | 208/680 (31%); 437/680 (64%) | 1.5e-167



Example 29. 

The NOV29 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 29A. 



Table 29A. NOV29 Sequence Analysis 




SEQ ID NO: 127 | 1776 bp


NOV29a, 
CG148431-01 
DNA Sequence 


ACTAAAGCCTGCAGAGACCTCTGAAGGAAAACCTGTCCCGGGCTCTGTCACTTCACACCCATGGCTA 


ACCCTGGAGGTGGTGCTGTTTGCAACGGGAAACTTCACAATCACAAGAAACAGAGCAATGGCTCACA 
AAGCAGAAACTGCACAAAGAATGGAATAGTGAAGGAAGCCCAGCAAAATGGGAAGCCACATTTTTAT 
GATAAGCTCATTGTTGAATCGTTTGAGGAAGCACCCCTTCATGTTATGGTTTTCACTTACATGGGAT 
ATGGAATTGGAACCC TGTTTGGCTATCTCAGAGACTTTTTAAGAAAC TGGGGAATAGAAAAATGC AA 
CGCAGCTGTGGAAAGAAAAGAACAAAAAGATTTTGTGCCACTGTATCAAGACTTTGAAAATTTTTAT 
ACAAGAAACCTTTACATGCGAATCAGAGACAACTGGAACCGGCCCATCTGCAGTGCCCCAGGGCCTC 
TGTTTGATTTGATGGAGAGGGTATCAGACGACTATAACTGGACGTTTAGGTTTACTGGAAGAGTCAT 
CAAAGATGTCATCAACATGGGCTCCTATAACTTCCTTGGTCTTGCAGCCAAGTATGATGAGTCTATG 
AGGACAATAAAGGATGTTTTAGAGGTGTATGGCACAGGCGTGGCCAGCACCAGGCATGAAATGGGCA 
C C TTGG ATAAGCAC AAGGAGTTGGAGGACCTTGTGGCTAAGTTCC TGAATGTGGAAGC AGCTATGGT 
CTTTGGGATGGGATTCGCAACTAACTCAATGAATATCCCAGCATTAGTTGGAAAGGGATGCCTCATT 
TTAAGTGATGAGTTAAACCACACATCGCTTGTGCTTGGGGCCCGACTCTCAGGTGCAACCATAAGAA 
TC TTCAAACAC AACAAC AC AC AAAGCCTAGAGAAGCTCCTGAGAGATGC TGTCATCTATGGCCAGCC 
TCGAACCCGCAGAGCTTGGAAAAAGATTCTCATCCTGGTGGAGGGTGTCTACAGCATGGAAGGTTCC 
ATCGTGCATCTGCCCCAGATCATAGCTCTAAAGAAGAAATACAAGGCTTACCTCTACATAGATGAAG 
CTCACAGTATTGGGGCCGTGGGCCCAACCGGCCGGGGTGTCACGGAGTTCTTTGGACTAGACCCTCA 
TGAAGTTGATGTGCTCATGGGCACATTCACCAAAAGTTTTGGAGCTTCAGGAGGTTACATAGCTGGA 
AGGAAGGACCTCGTGGATTATTTACGGGTTCACTCGCATAGTGCTGTTTATGCTTCATCCATGAGCC 
CACCGATAGCAGAGCT^AATCATCAGATCACTAAAACTTATCATGGGACTGGATGGGACCACTCAAGG 
GCTGCAGAGAGTACAGCAACTTGCGAAAAACACAAGATACTTCAGACAAAGACTGCAGGAAATGGGA 
TTCATTATCTATGGCAATGAGAATGCTTCTGTTGTTCCTCTGCTTCTTTATATGCCTGGTAAAGTAG 
CGGCTTTTGGAAGGCATATGCTAGAGAAAAAAATTGGAGTGGTGGTCGTGGGATTTCCAGCCACTCC 
CCTCGCAGAAGCTCGGGCTCGGTTTTGTGTTTCAGCGGCACATACCCGGGAGATGTTAGACACGGTT 
TTAGAAGCTCTTGATGAAATGGGTGATCTCTTGCAACTGAAATATTCCCGGCACAAGAAGTCAGCAC 
GTCCTGAGCTCTATGATGAGACGAGCTTTGAACTCGAAGATTAAGTTTCCTGGTCCTGAATGACACA 
TAAAGACTTTGCGAGAAAGACCTCCCTCCTTGCC 




ORF Start: ATG at 61 | ORF Stop: TAA at 1717
SEQ ID NO: 128 | 552 aa | MW at 62048.9 kD


NOV29a, 
CG148431-01 
Protein Sequence 



MANPGGGAVCNGKLHNHKKQSNGSQSRNCTKNGIVKEAQQNGKPHFYDKLIVESFE 
MGYGIGTLFG YLRDFLRTOG IEKCNAAVERKEQKDFVPLYQr^^ IRDNWNRPICSAP 
GPLFDLMERVSDDYNWTFRFTGRV^ 

MGTLDKHKELEDLVAKFLNVEAAMVFGMGF ATNSMNI PAL VGKGCL I LSDELNHTSLVLGARL S GAT 

irIfkhnntqslekllrdaviygqprtrrawkkililvegws^ 

DEAHS I GAVG PTGRG VTEFFGLDPHEVDVLMGTFTKS FG A SGG YI AGRKDL VDYLRVH SHS AVYAS S 
MSPPIAEQIIRSLKLIMGLDGTTQGLQRVQQLAKNTRYFRQRLQEMGFIIYGNENASWPLLLYMPG 
KVAAFARHMLEKKIGVVWGFPATPLAEARARFCVSAAHTREMLDTVLEALDEMGDLLQLKYSRHKK 
SARPELYDETSFELED 





SEQ ID NO: 129 | 1492 bp


NOV29b, 
CG148431-02 
DNA Sequence 


CACCGGATCCACCATGGCTAACCCTGGAGGTGGTGCTGTTTGCAACGGGAAACTTCACAATrArAAC; 

AAACAGAGCAATGGCTCACAAAGCAGAAACTGCACAAAGAATGGAATAGTGAAGGAAGCCCAGGATT 

TTGTGCCACTGTATCAAGACTTTGAAAATTTTTATACAAGAAACCTTTACATGCGAATCAGAGACAA 

CTGGAACCGGCCCATCTGCAGTGCCCCAGGGCCTCTGTTTGATGTGATGGAGAGGGTATCGGACGAC 

TATAACTGGACGTTTAGGT TTAC TGGAAGAGTCATCAAAG ATGTC ATC AAC ATGGGCTCC TATAAC T 

TCCTTGGTCTTGCAGCCAAGTATGATGAGTCTATGAGGACAATAAAGGATGTTTTAGAGGTGTATGG 

C AC AGGCGTGGCC AGCAC C AGGC ATGAAATGGGCACC TTGGATAAGC AC AAGGAGTTGGAGGACC TT 

GTGGCTAAGTTCCTGAATGTGGAAGCAGCTATGGTCTTTGGGATGGGATTCGCAACTAACTCAATGA 

ATATCCCAGCATTAGTTGGAAAGGGATGCCTCATTTTAAGTGATGAGTTAAACCACACATCGCTTGT 

GCTTGGGGC CCGAC TCTC AGGTGCAAC CATAAG AATCTTC AAAC ACAACAACACACAAAGCCTAGAG 

AAGCTCCTGAGAGATGCTGTCATCTATGGCCAGCCTCGAACCCGCAGAGCTTGGAAAAAGATTCTCA 

TCCTGGTGGAGGGTGTCTACAGCATGGAAGGT TCC ATCGTGC ATCTGCCCC AGATCATAGCTC TAAA 

GAAGAAATACAAGGCTTAC CTCTACATAGATG AAGCTC ACAGTATTGGGGC CGTGGGCCC AACCGGC 

CGGGGTGTCACGGAGTTCTTTGGACTAGACCCTCATGAAGTTGATGTGCTCATGGGCACATTCACCA 

AAAGTTTTGGAGCTTCAGGAGGTTACATAGCTGGAAGGAAGGACCTCGTGGATTATTTACGGGTTCA 

CTCGC ATAGTGC TGTTTATGCTTC ATC C ATGAGCCCAC CG ATAGC AGAGCAAATCATCAG ATCAC TA 

AAACTTATCATGGGACTGGATGGGACCACTCAAGGGCTGCAGAGAGTACAGCAACTTGCGAAAAACA 

CAAGATACTTCAGACAAAGACTGCAGGAAATGGGATTCATTATCTATGGCAATGAGAATGCTTCTGT 

TGTTCCTCTGC TTC TTTATATGCC TGGTAAAGTAGCGGCTTTTGC AAGGCATATGC TAGAGAAAAAA 

ATTGGAGTGGTGGTCGTGGGATTTCCAGCCACTCCCCTCGCAGAAGCTCGGGCTCGGTTTTGTGTTT 

CAGCGGCACATACCCGGGAGATGTTAGACACGGTTTTAGAAGCTCTTGATGAAATGGGTGATCTCTT 

GCAACTGAAATATTCCCGGGACAAGAAGTCAGCACGTCCTGAGCTCT 

CTC GAAGATCTCGAGGGC 




ORF Start: ATG at 14 | ORF Stop: at 1484





SEQ ID NO: 130 | 490 aa | MW at 54766.5 kD


NOV29b, 
CG148431-02 
Protein Sequence 


MAJtfPGGGAVCNGKLHNHKKQSNGSQSRNCTKNGIVKEAQD^ 

I CS APGPLFDVMERVSDDYNWTFRFTGRVI KDVINMGSYNFLGLAAKYDE SMRTIKDVLEVYGTGVA 
S TRHEMGTLDKHKELEDLVAKFLNVEAAMVFGMGFATNSMNI P AXVGKGCIi I LSDELNHTSL VLGAR 
LSGATIRIFKHNNTQSLEKLLRDAVIYGQPRTRRAWKKILILVEGVYSMEGS IVHLPQI IALKKKYK 
AYL YIDEAHS IGAVG PTGRGVTEFFGLDPHEVDVLMGTF TKS FG ASGGY I AGRKDLVDYLRVHSHS A 
VYASSMSPPIAEQI IRSLKL IMGLDGTTQGLQRVQQLAKNTRYFRQRLQEMGFI IYGNENAS WPLL 
LYMPGKVAAFAJRHMLEKKIGTWWGFPATPLA 
SRHKKSARPELYDETSFELED 



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 29B. 






Table 29B. Comparison of NOV29a against NOV29b. 


Protein Sequence | NOV29a Residues/Match Residues | Identities/Similarities for the Matched Region
NOV29b | 98..552; 36..490 | 438/455 (96%); 440/455 (96%)
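
The residue ranges in Table 29B use the "start..end" notation, and their lengths can be compared with the alignment length in the identity column. The Python sketch below is a minimal illustration, assuming inclusive 1-based ranges; `span_length` is a hypothetical helper name.

```python
def span_length(span: str) -> int:
    """Length of an inclusive residue range written as 'start..end'."""
    start, end = (int(part) for part in span.split(".."))
    return end - start + 1


# Ranges reported for the NOV29a / NOV29b comparison.
print(span_length("98..552"), span_length("36..490"))  # 455 455
```

Both ranges cover 455 residues, matching the aligned length of 455 in the identity column, which is consistent with an ungapped alignment over this region.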



Further analysis of the NOV29a protein yielded the following properties shown in
Table 29C.



Table 29C. Protein Sequence Properties NOV29a


PSort analysis: 


0.4761 probability located in microbody (peroxisome); 0.3000 probability 
located in nucleus; 0.2077 probability located in lysosome (lumen); 0.1000 
probability located in mitochondrial matrix space 


SignalP analysis: 


No Known Signal Sequence Predicted 
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A search of the NOV29a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 29D. 




Table 29D. Geneseq Results for NOV29a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV29a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
AAE22153 | Human TRNFR-15 protein - Homo sapiens, 552 aa. [WO200226950-A2, 04-APR-2002] | 1..552; 1..552 | 551/552 (99%); 552/552 (99%) | 0.0
AAG73598 | Human colon cancer antigen protein SEQ ID NO:4362 - Homo sapiens, 391 aa. [WO200122920-A2, 05-APR-2001] | 201..549; 38..387 | 269/352 (76%); 316/352 (89%) | e-158
ABB60160 | Drosophila melanogaster polypeptide SEQ ID NO 7272 - Drosophila melanogaster, 597 aa. [WO200171042-A2, 27-SEP-2001] | 54..543; 114..597 | 256/491 (52%); 350/491 (71%) | e-151
[illegible] | Human serine palmitoyltransferase (SPT)-like enzyme #2 - Homo sapiens, 230 aa. [WO200224884-A2, 28-MAR-2002] | [illegible]; 1..230 | [illegible]; 230/230 (99%) | e-133
AAY32003 | Rice serine palmitoyltransferase Lcb2 subunit - Oryza sativa, 489 aa. [WO9949021-A1, 30-SEP-1999] | 59..541; 5..483 | 237/485 (48%); 333/485 (67%) | e-133
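
Expect values in Table 29D appear in three textual forms: "0.0", a full exponent such as "4e-69" elsewhere in this document, and the abbreviated "e-158", which BLAST-style reports use for 1e-158. The Python sketch below shows one way to normalize these to floats before sorting or filtering hits; `parse_expect` is a hypothetical helper, not part of any search tool.

```python
def parse_expect(value: str) -> float:
    """Convert an Expect value string to a float.

    A bare exponent such as 'e-158' is read as '1e-158'.
    """
    text = value.strip()
    if text.lower().startswith("e"):
        text = "1" + text
    return float(text)


# Sort a few Expect values from most to least significant.
values = ["e-133", "0.0", "e-158", "e-151"]
print(sorted(values, key=parse_expect))
```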



In a BLAST search of public sequence databases, the NOV29a protein was found to
have homology to the proteins shown in the BLASTP data in Table 29E. 
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Table 29E. Public BLASTP Results for NOV29a 


Protein Accession Number | Protein/Organism/Length | NOV29a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
Q9UGB6 | DJ718P11.1.1 (Novel class II aminotransferase similar to serine palmotyltransferase (Isoform 1)) - Homo sapiens (Human), 414 aa (fragment). | 102..515; 1..414 | 414/414 (100%); 414/414 (100%) | 0.0
O15270 | Serine palmitoyltransferase 2 (EC 2.3.1.50) (Long chain base biosynthesis protein 2) (LCB 2) (Serine-palmitoyl-CoA transferase 2) (SPT 2) - Homo sapiens (Human), 562 aa. | 7..549; 18..558 | 383/546 (70%); 449/546 (82%) | 0.0
P97363 | Serine palmitoyltransferase 2 (EC 2.3.1.50) (Long chain base biosynthesis protein 2) (LCB 2) (Serine-palmitoyl-CoA transferase 2) (SPT 2) - Mus musculus (Mouse), 560 aa. | 7..549; 18..556 | 379/546 (69%); 449/546 (81%) | 0.0
JC5180 | serine C-palmitoyltransferase (EC 2.3.1.50) Lcb2 chain - mouse, 560 aa. | 7..549; 18..556 | 378/546 (69%); 449/546 (82%) | 0.0
O54694 | Serine palmitoyltransferase 2 (EC 2.3.1.50) (Long chain base biosynthesis protein 2) (LCB 2) (Serine-palmitoyl-CoA transferase 2) (SPT 2) - Cricetulus griseus (Chinese hamster), 560 aa. | 7..549; 18..556 | [illegible]; 446/546 (81%) | 0.0




PFam analysis predicts that the NOV29a protein contains the domains shown in the 
Table 29F. 
Table 29F. Domain Analysis of NOV29a

Pfam Domain | NOV29a Match Region | Identities/Similarities for the Matched Region | Expect Value
aminotran_1_2 | 193..521 | 71/363 (20%); 237/363 (65%) | 2.6e-29



Example 30. 

The NOV30 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 30A.



Table 30A. NOV30 Sequence Analysis 




SEQ ID NO: 131 | 576 bp


NOV30a, 
CG148888-01 
DNA Sequence 


TGAGCCAGCCCCGGATGACCCTGCGACCTGGAACAATGCGGCTGGCCTGCATGTTCTCTTCCATCCT
GCTGTTCGGAGCTGCAGGCCTCCTCCTCTTCATCAGCCTGCAGGACCCTACGGAGCTCGCCCCCCAG 
CAGGTGCCAGGAATAAAGTTCAACATCAGGCCAAGGCAGCCCCACCACGACCTCCCACCAGGCGGCT 
CTGGGGTGCGTTTTCCCGAGTTCGTCCAGTACCTGCTGGACGTGCACCGGCCCGTGGGGATGGACAT 
TCACTGGGACCATGTCAGCCGGCTCTGCAGCCCCTGCCTCATCGACTACGATTTCGTAGGCAAGTTC
GAGAGCATGGAGGACGATGCCAACTTCTTCCTGAGCCTCATCCGCGCGCCGCGGAACCTGACCTTCC 
CCCGGTTCAAGGACCGGCACTCGCAGGAGGCGCGGACCACAGCGAGGATCGCCCACCAGTACTTCGC 
CCAACTCTCGGCCCTGCAAAGGCAGCGCACCTACGACTTCTACTACATGGATTACCTGATGTTCAAC 
TATTCCAAGCCCTTTACAGATCTGTACTGAGGGGCGCCGC 




ORF Start: ATG at 15 | ORF Stop: TGA at 564
SEQ ID NO: 132 | 183 aa | MW at 21347.3 kD


NOV30a, 
CG148888-01 
Protein Sequence 


MTLRPGTMRLACMFSSILLFGAAGLLLFISLQDPTELAPQQVPGIKFNIRPRQPHHDLPPGGSGVRF 
PEFVQYLLDVHRPVGMDIHWDHVSRLCSPCLIDYDFVGKFESMEDDANFFLSLIRAPRNLTFPRFKD 
RHSQEARTTARIAHQYFAQLSALQRQRTYDFYYMDYLMFNYSKPFTDLY
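
The relationship between the NOV30a DNA sequence and its encoded polypeptide in Table 30A can be illustrated by translating a short stretch of the ORF. The Python sketch below builds the standard genetic code and translates the first 33 nucleotides starting at the reported ATG at position 15. The function and variable names are hypothetical, and use of the standard genetic code is an assumption for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated with base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1, stopping at the first stop codon.

    Whitespace is removed first, which also tolerates stray spaces
    inside a sequence line.
    """
    seq = "".join(dna.split()).upper()
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 33 nucleotides of the NOV30a ORF (from the ATG at position 15).
orf_head = "ATGACCCTGCGACCTGGAACAATGCGGCTGGCC"
print(translate(orf_head))  # MTLRPGTMRLA
```

The translated fragment matches the first eleven residues of the listed NOV30a protein sequence. The same function applied to the codons `GATTACCTGATGTTCAACTATTCC` near the 3' end of the ORF gives `DYLMFNYS`.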



Further analysis of the NOV30a protein yielded the following properties shown in 
Table 30B. 



Table 30B. Protein Sequence Properties NOV30a 


PSort analysis: 


0.8650 probability located in lysosome (lumen); 0.8200 probability located in
outside; 0.3657 probability located in microbody (peroxisome); 0.1000 
probability located in endoplasmic reticulum (membrane) 


SignalP analysis: 


Cleavage site between residues 38 and 39 



A search of the NOV30a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 30C. 
Table 30C. Geneseq Results for NOV30a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV30a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
ABB53266 | Human polypeptide #6 - Homo sapiens, 424 aa. [WO200181363-A1, 01-NOV-2001] | 62..183; 303..424 | 121/122 (99%); 121/122 (99%) | 4e-69
ABB53265 | Human polypeptide #5 - Homo sapiens, 628 aa. [WO200181363-A1, 01-NOV-2001] | 62..183; 507..628 | 121/122 (99%); 121/122 (99%) | 4e-69
AAE15437 | Human drug metabolising enzyme (DME)-4 - Homo sapiens, 396 aa. [WO200179468-A2, 25-OCT-2001] | 62..183; 275..396 | 121/122 (99%); 121/122 (99%) | 4e-69
AAB85083 | Human interleukin-6 (IL-6) like polypeptide - Homo sapiens, 171 aa. [WO200142484-A1, 14-JUN-2001] | 62..183; 50..171 | 121/122 (99%); 121/122 (99%) | 4e-69
AAM24429 | Murine EST encoded protein SEQ ID NO: 1954 - Mus musculus, 424 aa. [WO200154477-A2, 02-AUG-2001] | 62..183; 303..424 | 121/122 (99%); 121/122 (99%) | 4e-69



In a BLAST search of public sequence databases, the NOV30a protein was found to
have homology to the proteins shown in the BLASTP data in Table 30D. 



Table 30D. Public BLASTP Results for NOV30a 


Protein Accession Number | Protein/Organism/Length | NOV30a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
Q9H3N2 | GalNAc 4-sulfotransferase (GalNAc-4-O-sulfotransferase 1) (Carbohydrate (N-acetylgalactosamine 4-O) sulfotransferase 8) (Hypothetical 48.8 kDa protein) - Homo sapiens (Human), 424 aa. | 62..183; 303..424 | 121/122 (99%); 121/122 (99%) | 1e-68
Q9H2A9 | N-acetylgalactosamine-4-O-sulfotransferase - Homo sapiens (Human), 424 aa. | 62..183; 303..424 | 120/122 (98%); 120/122 (98%) | 4e-68
Q9BXH4 | GalNAc-4-sulfotransferase 2 - Homo sapiens (Human), 443 aa. | 62..179; 325..442 | 77/118 (65%); 95/118 (80%) | 1e-44
Q9BXH3 | GalNAc-4-sulfotransferase 2 - Homo sapiens (Human), 358 aa. | 62..179; 240..357 | 77/118 (65%); 95/118 (80%) | 1e-44
Q9BZW9 | N-acetylgalactosamine 4-O-sulfotransferase 2 GalNAc4ST-2 - Homo sapiens (Human), 438 aa. | 62..179; 320..437 | 77/118 (65%); 95/118 (80%) | 1e-44



Example 31. 

The NOV31 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 31A.



Table 31A. NOV31 Sequence Analysis 





SEQ ID NO: 133 | 2325 bp


NOV31a, 
CG149008-01 
DNA Sequence 


C CC AGGCCGGACAAGCGTCCCGAAAGCCCCGGGAGAGAC TAAGAAGC AATCC TCCC ACGCGCTTTC T 


CCCACCCTCGGGCCACTGAGACGGAGGGACAGAGGGCCGCCCTCGCGCGGCCGAGGCCCCGCCTCCC 


GCTCGCCCGCCCGCGCCTCCAGCGGAAGCCGGAAGCAAAAGCGGGTCCTGCTAGCCCCGCGGCTCCG 


AACTCGGTGGTCCTGG7VAGCTCCGCAGGATGGGGGAGAAGATGGCGGAAGAGGAGAGGTTCCCCAAT 


ACAACTCATGAGGGTTTCAATGTCACCCTCCACACCACCCTGGTTGTCACGACGAAACTGGTGCTCC 
CGACCCCTGGCAAGCCCATCCTCCCCGTGCAGACAGGGGAGCAGGCCCAGCAAGAGGAGCAGTCCAG 
CGGCATGACCATTTTCTTCAGCCTCCTTGTCCTAGCTATCTGCATCATATTGGTGCATTTACTGATC 
CGATACAGATTACATTTCTTGCCAGAGAGTGTTGCTGTTGTTTCTTTAGGTATTCTCATGGGAGCAG 
TTATAAAAATTATAGXGTTTAAAAAACTGGCGAATTGGAAGGAAGAAGAAATGTTTCGTCCAAACAT 
GTTTTTCC TCCTCCTGCTTCCC C CT ATTATCTTTGAGTC TGG ATATTCATTAC ACAAGGTGAGACTC 
AGGCACAGATTGGGTAACTTCTTTCAAAATATTGGTT^ 

CAATCTCCGCTTTTGTAGTAGGTGGAGGAATTTATTTTCTGGGTCAGGCTGATGTAATCTCTAAACT 

CAACATGACAGACAGTTTTGCGTTTGGCTCCCTAATATCTGCTGTCGATCCAGTGGCCACTATTGCC 

ATTTTCAATGCACTTCATGTGGACCCCGTGCTCAACATGCTGGTCTTTGGAGAAAGTATTCTCAACG 

ATGCAGTCTCCATTGTTCTGACCAACACAGCTGAAGGTTTAACAAGAAAAAATATGTCAGATGTCAG 

TGGGTGGCAAACATTTTTACAAGCCCTTGACTACTTCCTCAAAATGTTCTTTGGCTCTGCAGCGCTC 

GGCACTCTCACTGGCTTAATTTCTGCATTAGTGCTGAAGCATATTGACTTGAGGAAAACGCCTTCCT 

TGGAGTTTGGCATGATGATC^TTTTTGCTTATCTGCCTTATGGGCTTGCAGAAGGAATCTCACTCTC 

AGGCATCATGGCCATCCTGTTCTCAGGCATCGTGATGTCCCACTACACGCACCATAACCTCTCCCCA 

GTCACCCAGATCCTCATGCAGCAGACCCTCCGCACCGTGGCCTTCTTATGTGAAACATGTGTGTTTG 

CATTTCTTGGCCTGTCCATTTTTAGTTTTCCTCACAAGTTTGAAATTTCCTTTGTCATCTGGTC 

AGTGCTTGTACTATTTGGCAGAGCGGTAAACATTTTCCCTCTTTCCTACCTCCTGAATTTCTTCCGG 

GATCATAAAATCACACCGAAGATGATGTTCATCATGTGGTTTAGTGGCCTGCGGGGAGCCATCCCCT 

ATGCCCTGAGCCTACACCTGGACCTGGAGCCCATGGAGAAGCGGCAGCTCATCGGCACCACCACCAT 

CGTCATCGTGCTCTTCACCATCCTGCTGCTGGGCGGCAGCACCATGCCCCTCATTCGCCTCATGGAC 

ATC G AGGAC G C C AAG GC AC ACCGC AGGAACAAG AAGGAC GT C AACC TC AGC AAGAC T G AG AAG ATGG 

GCAACACTGTGGAGTCGGAGCACCTGTCGGAGCTCACGGAGGAGGAGTACGAGGCCCACTACATCAG 

GCGGCAGGACCTTAAGGGCTTCGTGTGGCTGGACGCCAAGTACCTGAACCCCTTCTTCACTCGGAGG 

CTGACGCAGGAGGACCTGCACCACGGGCGCATCCAGATGAAAACTCTCACCAACAAGTGGTACGAGG 

AGGT ACGC C AGGGC C CC TCCGGCTCCGAGGACG ACG AGCAGGAGCTGCTCTGACGCC AGGTGCC AAG 

GC TTC AGGC AGGCAGGC CC AGG ATGGGCGT TTGCTGCGC AC AGACACTC AGCAGGGGCCTCGCAGAG 


ATGCGTGCATCCAGCAGCCCCTTCAAGACATAAGAGGGCGGGGCGAGGTACTGGCTGCAGAGTCGCC 


TTAGTCCAGAACCTGACAGGCCTCTGGAGCCAGGCGACTTCTTGGGAAACTGTCATCTCCCGACTCC 


TCCCTGAGCCAGCCTCCGCTCAGTGTGGCTCCTCAGCCCACAGAGGGGAGGGAGCATGGGGCCAGGT 


GCCAGTCATCTGTGAAGCTAGGGCGCCTACCCCCCCACCCGGAGGAC 




ORF Start: ATG at 230 | ORF Stop: TGA at 1994 


SEQ ID NO: 134 | 588 aa 


MW at 66297.1kD 


NOV31a, 
CG149008-01 
Protein Sequence 


MGEKMAEEERFPOTTHEGFNVTLHTTLVOT^ 

VLAI C I ILVHLL IRYRLHFLPESVAWSLGILMGAVIKIIEFKKLANWKEEEMFRPNMFFLLLLPPI 
IFESGYSLHKTOLRHTLGNFFQNIGSITLFAVFGTAISAFVVGGGIYFLGQADVISKLNMTDSFAFG 
SLISAVDPVATIAIF]^HVDPVLI^WGESI]^AVSI^ 

DYFLKMFFGSAALGTLTGLI SALVLKHIDLRKTPSLEFGMMI IFAYLPYGLAEGISLSGIMAI LFSG 
IVMSHYTHHNLSPVTQILMQQTLRWAFLCETCVFAFMLSIFSFPHKFEISFVIWCIVLVLFGRAV 
NIFPLSYLLNFFRDHKITPKMMFIMWFSGLRGAIPYALSLHLDLEPMEKRQLIGTTTIVIVLFTILL 
LGGSTMPLIRLMDIEDAKAHRRNKKDVr^ 

LDAKYLNPFFTRRLTQEDLHHGRIQI^TLTNKWYEEVRQGPSGSEDDEQELL | 
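
The ORF coordinates reported in Table 31A can be cross-checked against the protein length with simple arithmetic. The sketch below assumes that both positions are 1-based and that the reported stop position is the first base of the stop codon; the function name `orf_codon_count` is illustrative and does not come from the source.

```python
def orf_codon_count(orf_start: int, orf_stop: int) -> int:
    """Number of sense codons between a start codon and a stop codon.

    Assumes 1-based positions, with orf_stop pointing at the first
    base of the stop codon (an assumption about how the tables report it).
    """
    coding_bases = orf_stop - orf_start
    if coding_bases <= 0 or coding_bases % 3 != 0:
        raise ValueError("start and stop are not in the same reading frame")
    return coding_bases // 3


# Values copied from Table 31A: ATG at 230, TGA at 1994.
print(orf_codon_count(230, 1994))  # 588
```

Under these assumptions the NOV31a ORF spans 588 codons, which agrees with the residue range 1..588 reported for NOV31a in Table 31C.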




Further analysis of the NOV31a protein yielded the following properties shown in 
Table 31B. 



Table 31B. Protein Sequence Properties NOV31a 


PSort analysis: 


0.8000 probability located in plasma membrane; 0.4000 probability located in 
Golgi body; 0.3000 probability located in endoplasmic reticulum (membrane); 
0.3000 probability located in microbody (peroxisome) 


SignalP analysis: 


Cleavage site between residues 40 and 41 
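
The PSort line in Table 31B is a semicolon-separated list of scores and compartments. As a minimal sketch, assuming that every entry follows the pattern "score probability located in compartment", the line can be parsed and ranked as follows; the names `parse_psort` and `NOV31A_PSORT` are illustrative only.

```python
import re

# One entry looks like "0.8000 probability located in plasma membrane".
ENTRY = re.compile(r"(\d+\.\d+) probability located in ([^;]+)")


def parse_psort(text: str) -> list:
    """Return (score, compartment) pairs in the order they are listed."""
    return [(float(score), place.strip()) for score, place in ENTRY.findall(text)]


# PSort line copied from Table 31B.
NOV31A_PSORT = (
    "0.8000 probability located in plasma membrane; "
    "0.4000 probability located in Golgi body; "
    "0.3000 probability located in endoplasmic reticulum (membrane); "
    "0.3000 probability located in microbody (peroxisome)"
)

predictions = parse_psort(NOV31A_PSORT)
best_score, best_place = max(predictions)
print(best_place, best_score)
print(sum(score for score, _ in predictions))
```

The four listed scores add up to more than 1, so they appear to be independent per-compartment scores rather than a distribution over compartments.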




A search of the NOV31a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 31C. 


Table 31C. Geneseq Results for NOV31a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV31a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 


ABG61535 


Human transporter and ion 
channel, TRICH5, Incyte ID 
7476938CD1 - Homo 
sapiens, 671 aa. 
[WO200240541-A2, 
23-MAY-2002] 


1..588 
91..671 


581/588 (98%) 
581/588 (98%) 


0.0 


AAM24062 


Human EST encoded protein 
SEQ ID NO: 1587 - Homo 
sapiens, 315 aa. 
[WO200154477-A2, 
02-AUG-2001] 


274..588 
1..315 


315/315 (100%) 
315/315 (100%) 


0.0 


AAB29621 


Cat flea HMT Na/H 
transporter, SEQ ID 
NO: 1868 - Ctenocephalides 
felis, 608 aa. 
[WO200061621-A2, 
19-OCT-2000] 


8..584 
33..602 


329/585 (56%) 
416/585 (70%) 


e-175 


ABB59364 


Drosophila melanogaster 
polypeptide SEQ ID NO 
4884 - Drosophila 
melanogaster, 649 aa. 
[WO200171042-A2, 
27-SEP-2001] 


44..587 
86..63S 


310/562 (55%) 
399/562 (70%) 


e-170 


AAO14196 


Human transporter and ion 
channel TRICH-13 - Homo 
sapiens, 631 aa. 
[WO200204520-A2, 
17-JAN-2002] 


117..547 
125..542 


166/439 (37%) 
253/439 (56%) 


2e-72 
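
The Expect Value column of Table 31C mixes ordinary notation ("2e-72", "0.0") with the abbreviated form "e-175". A minimal sketch for converting these cells to numbers follows; it assumes that the abbreviated form carries an implied mantissa of 1, and the function name `parse_expect_value` is illustrative.

```python
def parse_expect_value(text: str) -> float:
    """Convert an Expect Value cell such as "2e-72", "e-175" or "0.0" to a float."""
    text = text.strip()
    if text.startswith("e"):
        # Abbreviated form: assume an implied mantissa of 1.
        text = "1" + text
    return float(text)


# Expect Value column of Table 31C, in table order.
cells = ["0.0", "0.0", "e-175", "e-170", "2e-72"]
values = [parse_expect_value(cell) for cell in cells]
print(values == sorted(values))
```

Read this way, the hits in Table 31C are listed from the smallest Expect Value to the largest.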


In a BLAST search of public sequence databases, the NOV31a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 31D. 


Table 31D. Public BLASTP Results for NOV31a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV31a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


BAA76783 


KIAA0939 protein - Homo 
sapiens (Human), 595 aa 
(fragment). 


1..588 
15..595 


581/588 (98%) 
581/588 (98%) 


0.0 


Q8R4D1 


Na-H exchanger isoform 
NHE8 - Mus musculus 
(Mouse), 576 aa. 


5..587 
1..575 


556/583 (95%) 
565/583 (96%) 


0.0 


Q9Y507 


DJ963K23.4 (Continues in 
dJ1041C10(AL162615))- 
Homo sapiens (Human), 437 
aa (fragment). 


152..588 
1..437 


437/437 (100%) 
437/437 (100%) 


0.0 


Q9Y2E8 


KIAA0939 protein - Homo 
sapiens (Human), 411 aa 
(fragment). 


182..588 
5..411 


405/407 (99%) 
406/407 (99%) 


0.0 


AAH34508 


Hypothetical protein - Mus 
musculus (Mouse), 388 aa 
(fragment). 


209..587 
9..387 


366/379 (96%) 
374/379 (98%) 


0.0 
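
Each Identities/Similarities cell in Table 31D has the form "matched/aligned (percent%)". The sketch below parses such cells and compares the printed percentage with the fraction; the helper names are illustrative, and the comparison is only run on the identity cells shown, so it should not be read as a statement about how every column was computed.

```python
import re

# Matches cells such as "581/588 (98%)" or "121/122(99%)".
CELL = re.compile(r"(\d+)\s*/\s*(\d+)\s*\((\d+)%\)")


def parse_match_cell(cell: str) -> tuple:
    """Return (matched, aligned, printed_percent) from one table cell."""
    match = CELL.fullmatch(cell.strip())
    if match is None:
        raise ValueError(f"unrecognised cell: {cell!r}")
    matched, aligned, percent = (int(group) for group in match.groups())
    return matched, aligned, percent


def truncated_percent(matched: int, aligned: int) -> int:
    """Integer percentage with the fractional part dropped."""
    return matched * 100 // aligned


# Identity cells copied from Table 31D.
for cell in ["581/588 (98%)", "556/583 (95%)", "405/407 (99%)", "366/379 (96%)"]:
    matched, aligned, printed = parse_match_cell(cell)
    print(cell, truncated_percent(matched, aligned) == printed)
```

For these cells the printed value matches the truncated percentage; for example 581/588 is about 98.8% and is printed as 98%, which suggests truncation rather than rounding.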


PFam analysis predicts that the NOV31a protein contains the domains shown in the 
Table 31E. 



Table 31E. Domain Analysis of NOV31a 


Pfam Domain 


NOV31a Match 
Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


Na_H_Exchanger 


62..485 


141/465 (30%) 
345/465 (74%) 


3.1e-98 



Example 32. 

The NOV32 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 32A. 



Table 32A. NOV32 Sequence Analysis 




SEQ ID NO: 135 | 367 bp 


NOV32a, 
CG149350-01 
DNA Sequence 


ATGGCGGGGAGAAGGAAGCTCATCGCAGTGATCAGAGACAAGGACACGGTGACTGGTTTCCTGCTGG 
GCAGCATAGGGGAGCTTAACAAGAACTGCCACCCCAATTTCCTGGTGGTGGAGAAGGATACGACCAT 
CAATGAGATCGAAGACACTTTCCGGCAATTTCTAAACCGGGATGACACTGGCATCATCCTCATCAAC 
CAGTACATCGCAGAGATGGTGCAGCATGCCCTGGACACCCACCAGCACTCTATCCCTACTGTCCTGG 
AGATCCCCTCCAAGGAGCACCCATATGAGGACGCCAAGGACTCCACCCTGCGGAGGGCCAGGGGCAT 
GTTCACTGCCGAAGACCTGTGCTAGGGTCTTT 




ORF Start: ATG at 1 | ORF Stop: TAG at 358 





SEQ ID NO: 136 | 119 aa 


MW at 13566.3kD 


NOV32a, 
CG149350-01 
Protein Sequence 


MAGRRKLIAVIRDKDTVTGFLLGSIGELNKNCHPNFLVVEKDTTINEIEDTFRQFLNRDDTGIILIN 
QYIAEMVQHALDTHQHSIPTVLEIPSKEHPYEDAKDSTLRRARGMFTAEDLC 





SEQ ID NO: 137 | 367 bp 


NOV32b, 
CG149350-02 
DNA Sequence 


ATGGCGGGGAGAAGGAAGCTCATCGCAGTGATCAGAGACAAGGACACGGTGACTGGTTTCCTGCTGG 
GCAGCATAGGGGAGCTTAACAAGAACTGCCACCCCAATTTCCTGGTGGTGGAGAAGGATACGACCAT 
CAATGAGATCGAAGACACTTTCCGGCAATTTCTAAACCGGGATGACACTGGCATCATCCTCATCAAC 
CAGTACATCGCAGAGATGGTGCAGCATGCCCTGGACACCCACCAGCACTCTATCCCTACTGTCCTGG 
AGATCCCCTCCAAGGAGCACCCATATGAGGACGCCAAGGACTCCACCCTGCGGAGGGCCAGGGGCAT 
GTTCACTGCCGAAGACCTGTGCTAGGGTCTTT 




ORF Start: ATG at 1 | ORF Stop: TAG at 358 


SEQ ID NO: 138 | 119 aa 


MW at 13566.3kD 


NOV32b, 
CG149350-02 
Protein Sequence 


MAGRRKLIAVIRDKDTVTGFLLGSIGELNKNCHPNFLVVEKDTTINEIEDTFRQFLNRDDTGIILIN 
QYIAEMVQHALDTHQHSIPTVLEIPSKEHPYEDAKDSTLRRARGMFTAEDLC 
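
The relationship between the NOV32b DNA sequence, its ORF coordinates and its protein sequence can be checked by translating the listing. The sketch below assumes the standard genetic code (the source does not name a translation table) and 1-based ORF coordinates; `translate_orf` and the constant names are illustrative.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_orf(dna: str, orf_start: int) -> tuple:
    """Translate from a 1-based start position up to the first stop codon.

    Returns (protein, stop_position), where stop_position is the 1-based
    position of the first base of the stop codon.
    """
    residues = []
    for i in range(orf_start - 1, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            return "".join(residues), i + 1
        residues.append(aa)
    raise ValueError("no in-frame stop codon")


# NOV32b DNA (SEQ ID NO: 137) and protein (SEQ ID NO: 138) as listed in Table 32A.
NOV32B_DNA = (
    "ATGGCGGGGAGAAGGAAGCTCATCGCAGTGATCAGAGACAAGGACACGGTGACTGGTTTCCTGCTGG"
    "GCAGCATAGGGGAGCTTAACAAGAACTGCCACCCCAATTTCCTGGTGGTGGAGAAGGATACGACCAT"
    "CAATGAGATCGAAGACACTTTCCGGCAATTTCTAAACCGGGATGACACTGGCATCATCCTCATCAAC"
    "CAGTACATCGCAGAGATGGTGCAGCATGCCCTGGACACCCACCAGCACTCTATCCCTACTGTCCTGG"
    "AGATCCCCTCCAAGGAGCACCCATATGAGGACGCCAAGGACTCCACCCTGCGGAGGGCCAGGGGCAT"
    "GTTCACTGCCGAAGACCTGTGCTAGGGTCTTT"
)
NOV32B_PROTEIN = (
    "MAGRRKLIAVIRDKDTVTGFLLGSIGELNKNCHPNFLVVEKDTTINEIEDTFRQFLNRDDTGIILIN"
    "QYIAEMVQHALDTHQHSIPTVLEIPSKEHPYEDAKDSTLRRARGMFTAEDLC"
)

protein, stop_position = translate_orf(NOV32B_DNA, 1)
print(len(NOV32B_DNA), len(protein), stop_position)
print(protein == NOV32B_PROTEIN)
```

Translating from the ATG at position 1 reproduces the 119-residue protein listed for NOV32b and places the TAG stop codon at position 358, as stated in the table.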




Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 32B. 




Table 32B. Comparison of NOV32a against NOV32b. 


Protein Sequence 


NOV32a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV32b 


1..119 
1..119 


119/119(100%) 
119/119(100%) 



Further analysis of the NOV32a protein yielded the following properties shown in 
Table 32C. 


Table 32C. Protein Sequence Properties NOV32a 


PSort analysis: 


0.4852 probability located in mitochondrial matrix space; 0.4500 probability 
located in cytoplasm; 0.1957 probability located in mitochondrial inner 
membrane; 0.1957 probability located in mitochondrial intermembrane space 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV32a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 32D. 



Table 32D. Geneseq Results for NOV32a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV32a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAW27337 


Human vacuolar ATPase 14 
kDa subunit hV-14B - Homo 
sapiens, 119 aa. 
[JP09168390-A, 
30-JUN-1997] 


1..118 
1..118 


105/118(88%) 
108/118(90%) 


2e-54 


AAW27336 


Human vacuolar ATPase 14 
kDa subunit hV-14A - Homo 
sapiens, 119 aa. 
[JP09168390-A, 
30-JUN-1997] 


1..118 
1..118 


104/118(88%) 
107/118(90%) 


8e-54 




Drosophila melanogaster 
polypeptide SEQ ID NO 
15576 - Drosophila 
melanogaster, 124 aa. 
[WO200171042-A2, 
27-SEP-2001] 


6..118 
10..122 


71/113(62%) 
91/113(79%) 


2e-38 


ABB57798 


Drosophila melanogaster 
polypeptide SEQ ID NO 186 
- Drosophila melanogaster, 
124 aa. [WO200171042-A2, 
27-SEP-2001] 


6..114 
10..118 


58/109 (53%) 
84/109 (76%) 


7e-29 


AAG35989 


Zea mays protein fragment 
SEQ ID NO: 44042 - Zea 
mays subsp. mays, 130 aa. 
[EP1033405-A2, 
06-SEP-2000] 


1..118 
1..125 


56/125 (44%) 
85/125 (67%) 


1e-27 


In a BLAST search of public sequence databases, the NOV32a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 32E. 


Table 32E. Public BLASTP Results for NOV32a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV32a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


P50408 


Vacuolar ATP synthase 
subunit F (EC 3.6.3.14) 
(V-ATPase F subunit) 
(Vacuolar proton pump F 
subunit) (V-ATPase 14 kDa 
subunit) - Rattus norvegicus 
(Rat), 119 aa. 


1..118 
1..118 


104/118(88%) 
108/118 (91%) 


1e-53 


Q16864 


Vacuolar ATP synthase 
subunit F (EC 3.6.3.14) 
(V-ATPase F subunit) 
(Vacuolar proton pump F 
subunit) (V-ATPase 14 kDa 
subunit) - Homo sapiens 
(Human), 119 aa. 


1..118 
1..118 


104/118 (88%) 
107/118 (90%) 


2e-53 


Q9D1K2 


1110004G16Rik protein - 
Mus musculus (Mouse), 119 
aa. 


1..118 
1..118 


103/118(87%) 
108/118(91%) 


5e-53 


Q28029 


Vacuolar ATP synthase 
subunit F (EC 3.6.3.14) 
(V-ATPase F subunit) 
(Vacuolar proton pump F 
subunit) (V-ATPase 14 kDa 
subunit) - Bos taurus 
(Bovine), 110 aa (fragment). 


10..118 
1..109 


97/109 (88%) 
100/109 (91%) 


7e-50 


Q9I8H3 


Vacuolar ATP synthase 
subunit F (EC 3.6.3.14) 
(V-ATPase F subunit) 
(Vacuolar proton pump F 
subunit) (V-ATPase 14 kDa 
subunit) - Xenopus laevis 
(African clawed frog), 110 aa 
(fragment). 


10..118 
1..109 


83/109 (76%) 
94/109 (86%) 


7e-43 



PFam analysis predicts that the NOV32a protein contains the domains shown in the 
Table 32F. 




Table 32F. Domain Analysis of NOV32a 


Pfam Domain 


NOV32a Match Region 


Identities/ 
Similarities 
for the Matched 
Region 


Expect Value 


ATP-synt_F 


8..108 


51/107(48%) 
90/107 (84%) 


9.2e-43 



Example 33. 

The NOV33 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 33A. 



Table 33A. NOV33 Sequence Analysis 




SEQ ID NO: 139 | 1510 bp 


NOV33a, 
CG149463-01 


ATGGGTTCAGACTTTATGCCCTGAAAAGATCCTTC CAGCCCTGGCCATC TTGGACTTCTGGAGCTAr 


CCTGGCTCACAGGGGTCTTGTTGCCCTGGGTGTCCCCAGTTCTTGAAAAGAATCAGCCTGGGAGG(^ 




DNA Sequence 


TCATTTCTTGTTCCATCCATGCAGGGGTTGCTTACCTCGGGTAGGAAACCCTCAGGCGGTGGCARaT 


GCACAGGTAGGGGAGGATGGAGAGGGCAGTGGTGCCTGAAGCCCTGGATGGGCGGAGCTGACCCCCC 
AACACCAACTCTATCATGCCTGCTCCTCCCTGTCCCCCCAGAGCTGCCTGATCATTGCTACAGAATG 
AACTCTAGCCCAGCTGGGACCCCAAGTCCACAGCCCTCCAGGGCCAATGGGAACATCAACCTGGGGC 
CTTCAGCCAACCCAAATGCCCAGCCCACGGACTTCGACTTCCTCAAAGTCATCGGCAAAGGGAACTA 
CGGGAAGGTCCTACTGGCCAAGCGCAAGTCTGATGGGGCGTTCTATGCAGTGAAGGTACTACAGAAA 
AAGTCCATCTTAAAGAAGAAAGAGCAGAGCCACATCATGGCAGAGCGCAGTGTGCTTCTGAAGAACG 
TGCGGCACCCCTTCCTCGTGGGCCTGCGCTACTCCTTCCAGACACCTGAGAAGCTCTACTTCGTGCT 
CGACTATGTCAACGGGGGAGAGCTCTTCTTCCACCTGCAGCGGGAGCGCCGGTTCCTGGAGCCCCGG 
GCCAGGTTCTACGCTGCTGAGGTGGCCAGCGCCATTGGCT 

GGGATCTGAAACCAGAGAACATTCTCTTGGACTGCCAGTACTTGGCACCTGAAGTGCTTCGGAAAGA 
GCCTTATGATCGAGCAGTGGACTGGTGGTGCTTGGGGGCAGTCCTCTACGAGATGCTCCATGGCCTG 
CCGCCC TTC TAC AGCCAAGATGTATCCC AG ATGTATGAGAAC ATTC TGCACC AGCCGC TACAGATCC 
CCGGAGGCCGGACAGTGGCCGCCTGTGACCTCCTGCAAAGCCTTCTCCACAAGGACCAGAGGCAGCG 
GCTGGGCTCCAAAGCAGACTTTCTTGAGATTAAGAACCATGTATTCTTCAGCCCCATAAACTGGGAT 
GACCTGTACCACAAGAGGCTAACTCCACCCTTCAACCCAAATGTGACAGGACCTGCTGACTTGAAGC 
ATTTTCACCCAGAGTTCACCC^GGAAGCTGTGTCCAAGTCCATTGGCTGTACCCCCGACACTGTGGC 
CAGCAGCTCTGGGGCCTCAAGTGCATTCCTGGGATTTTCTTATGCGCCAGAGGATGATGACATCTTG 
GATTGTTAGAAG AG AAGGGCCTGTGAAAC TACTGAGGC C AGC TGGT ATTAGTAAGG AATTACC TTC A 




GC TGC TAGGAAGAGCGAC TCAAACTAACAATGGCTT 




ORF Start: ATG at 220 | ORF Stop: TAG at 1414 






SEQ ID NO: 140 | 398 aa | MW at 44552.5kD 


NOV33a, 
CG149463-01 
Protein Sequence 


MQGLLTSGRKPSGGGRCTGRGGWRGQWCLKFVMG 

TPSPQPSRANGNINLGPSANPNAQPTDFDFLKVIGKGNYGKVLLAKRKSDGAFYAVKVLQKKS 
KEQSHIMAERSVLLKNVRHPFLVGLRYSFQTPEKL^^ 

EVASAIGYLHSLNIIYRDLKPENILLDCQYLAPEVLRKEPYDRAVDWWCLGAVLYEMLHGLPPFYSQ 
DVSQMYENILHQPLQIPGGRWAACDLLQSLLHKDQRQRLGSKADFLEIKNHVFFSPINWDDLYHKR 
LTPPFNPOTTGPADLKHFDPEFTQEAVSKSIGCTPiyiVASSSGASSAFLGFSYAPEDDDILDC 



Further analysis of the NOV33a protein yielded the following properties shown in 
Table 33B. 



Table 33B. Protein Sequence Properties NOV33a 


PSort analysis: 


0.4500 probability located in cytoplasm; 0.2677 probability located in 
microbody (peroxisome); 0.1859 probability located in lysosome (lumen); 
0.1000 probability located in mitochondrial matrix space 


SignalP analysis: 


No Known Signal Sequence Predicted 


A search of the NOV33a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 33C. 






Table 33C. Geneseq Results for NOV33a 




Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV33a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAY95276 


Human serum and 
glucocorticoid-induced 
protein kinase 2-beta - Homo 
sapiens, 427 aa. 
[WO200035946-A1, 
22-JUN-2000] 


1..398 
1..427 


398/427 (93%) 
398/427 (93%) 


0.0 


AAM25594 


Human protein sequence 
SEQ ID NO: 1109 - Homo 
sapiens, 382 aa. 
[WO200153455-A2, 
26-JUL-2001] 


53..398 
8..382 


346/375 (92%) 
346/375 (92%) 


0.0 


AAE22765 


Human serum and 
glucocorticoid-induced 
protein kinase, SGK2-alpha - 
Homo sapiens, 367 aa. 
[WO200224947-A2, 
28-MAR-2002] 


61..398 
1..367 


338/367 (92%) 
338/367 (92%) 


0.0 


AAB65708 


Novel protein kinase, SEQ 
ID NO: 237 - Homo sapiens, 
367 aa. [WO200073469-A2, 
07-DEC-2000] 


61..398 
1..367 


337/367 (91%) 
338/367 (91%) 


0.0 


AAB65615 


Novel protein kinase, SEQ 
ID NO: 141 - Mus musculus, 
244 aa. [WO200073469-A2, 
07-DEC-2000] 


184..398 
1..244 


215/244 (88%) 
215/244 (88%) 


e-122 



In a BLAST search of public sequence databases, the NOV33a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 33D. 




Table 33D. Public BLASTP Results for NOV33a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV33a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 



Expect 
Value 


Q9HBY8 


Protein kinase - Homo sapiens 
(Human), 427 aa. 


1..398 
1..427 


398/427 (93%) 
398/427 (93%) 


0.0 


Q9UKG6 


Protein kinase (DJ138B7.2) 
(Serum/glucocorticoid 
regulated kinase 2) (Similar to 
serum/glucocorticoid 
regulated kinase 2) - Homo 
sapiens (Human), 367 aa. 


61..398 
1..367 


338/367 (92%) 
338/367 (92%) 


0.0 


Q8R0P6 


Serum/glucocorticoid 
regulated kinase 2 - Mus 
musculus (Mouse), 366 aa. 


61..397 
1..365 


317/366 (86%) 
326/366 (88%) 


0.0 


O73927 


S-sgk2 - Squalus acanthias 
(Spiny dogfish), 594 aa. 


70..396 
236..594 


235/359(65%) 
277/359(76%) 


e-133 


O73926 


S-sgkl - Squalus acanthias 
(Spiny dogfish), 433 aa. 


61..396 
60..433 


239/374 (63%) 
282/374 (74%) 


e-132 



PFam analysis predicts that the NOV33a protein contains the domains shown in the 
Table 33E. 




Table 33E. Domain Analysis of NOV33a 


Pfam Domain 


NOV33a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


pkinase 


95..228 


54/135 (40%) 
116/135(86%) 


5e-39 


pkinase 


231..323 


35/128(27%) 
69/128(54%) 


1.5e-21 


pkinase_C 


324..393 


25/73 (34%) 
47/73 (64%) 


3.1e-15 



Example 34. 

The NOV34 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 34A. 



Table 34A. NOV34 Sequence Analysis 




SEQ ID NO: 141 | 2152 bp 




GGGGGGCCTGAGCCTCTCCGCCGGCGCAGGCTCTGCTCGCGCCAGCTCGCTCCCGCAGCCATGCCCA 


NOV34a, 
CG149536-01 
DNA Sequence 


CCACC ATCGAGC GGGAGTTCGAAGAGTTGG AT ACTC AGCGTCGCTGGCAGCCGCTGTACT TGGAAAT 
TCGAAATGAGTC CCATGAC TATCC TC ATAGAGTGGCC AAGTTTCCAGAAAAC AGAAATCGAAACAGA 
TACAGAGATGTAAGCCCATATGATCACAGTCGTGTTAAACTGCAAAATGCTGAGAATGATTATATTA 
Ts.rpnrT* ar"n r PTaP r l v PPAP ATfAPA APAPGPAPAA AGG AGTTAPATPTTAAPACAGGGACCAPTTPPTAA 
CAC ATGCTGCC ATTTC TGGCTTATGGTTTGGC AG C AGAAG AC CAAAGCAGTTGTC ATGCTGAACCGC 
ATTGTGGAG AG AGAATCG AGTGGTGAAAC C AGAAC AATATCTCACTTTC ATT ATAC TACC TGGCCAG 
a^TTTOnARTrPPTfiAATCACCAGCTT^TTTCTCAATTTCTTGTTTAAAGTGAGAGAATCTGGCTC 
PTTf^AAPPPTfJAPPATGGGCCTGCGGTGATCCACTGTAGTGCAGGCATTGGGCGCTCTGGCACCTTC 
rppmpmr;r«m7vf3 ap APTTT5TPTTGTTTTGATGGAAAAAGGAGATGATATTAAC ATAAAACAAGTGTTAC 
mpn araTnana a A AT APPG A ATGGGTPTTATTCAGACCCCAGATCAACTGAGATTCTCATACATGGC 
t a t a a tap a Anp AGP A A A a tct AT A AAGGGAGATTCTAGTATAC AGAAACG ATGGAAAGAACTT TC T 
A app A artAPTTATPTPPTGPPTTTGATPATTCACCAAACAAAATAATGACTGAAAAATACAATGGGA 
arapaaTA pptp T AP A AG A AG A AAA AP TG AP AGGTGACCGATGTACAGGACT TTCC TC TAAAATGC A 
apatapaa TGG AGG AG A AP AGTG AGAGTGC TCTACGGAAjfCGTATTCGAGAGGACAGAAAGGC C ACC 
ar- apPTP AC2A APrtLTPPAnPAGATGAAAPAGAGGCTAJVATGAGAATGAACGAJVAJ^GAAAAAGGTGGT 
rnarpaTTPPpa aPPT^AT'TPTPAP^AAGATGGGGTTTATGTCAGTCATTTTGGTTGGCGCTTTTGTTGG 
pTpr , Ar , ar ,f rr ,r n l T | T ir r r rpanr AAA ATPPPPT ATAAAP AATTAATTTTGCCCAGCAAGCTTCTGCACTA 


rrn7\ ap rpr» ana r»TP pt a P A a A TP A T APPfJPTTTflTPTGP AGP A AACGPCTP ATATCC C AAAAACGG 


TGCAGTAGAATAGACATCAACCAGATAAGTGATATTTACAGTCACAAGCCCAACATCTCAGGACTCT 


TGACTGCAGGTTCCTCTGAJVCCCCAAACTGTAJ^TGGCTGTCTAAAATAAAGACATTCATCTTTC 


AAAAACTGGT AAATTTTGC AAC TGTATTC ATAC ATGTCAAACACAGT ATT TC ACC TG ACC AAC ATTG 


AG ATATCC TT TATC ACAGGATTTGTTTTTGGAGGC TATC TGGATT TTAAC CTGC ACTTGATATAAGC 


AATAAATATOGTGGTTTTATCTACGTTAOTGGAAAGAAAATGACATTOAAATA^ 


TAATGTACTATTGACATGGGCATCAACACTTTTATTCTTAAGCATTTCAGGGTAAATATATTTTATA 


AGTATCTATTTAATCTOTTGTAGTTAACTCTACTTTTOAAGAGCTCMTTTGAA 


AAAAAAAj^Aj^TxXSTATGTCGATTGAATTGTACTGGATACATTTTCCATTTTTCTAAAAAGAAGTTO 


ATATG AGC AGT TAGAAGTTGGAATAAGC AATTTC TACTATATATTGCATTTC TTTTATGTTTTACAG 


TTTTCCCC ATT TTAAAAAGAAAAGC AAAC AAAGAAAC AJVAAGTT TTTCCT AJ^AAATATC TT TG AAGG 


AAAATTCTCCTTACTCGGATAGTCAGGTAAACAGTTGGTCAAGACTTTOTAJ^GAAATTGGTTTCTG 


TAAATCCC ATT ATTGATATGTTT ATT TTTC ATG AAAATT TC AATGTAGTTGGGGTAGATTATGATTT 


AGGAAGCAAAAGTAAGAAGCAGCATTTTATCATTCATAATTTCAGTTTACTAGACTGAAGTTTTGAA 


GTAAACCC 




ORF Start: ATG at 61 | ORF Stop: TAA at 1171 








SEQ ID NO: 142 | 370 aa 


MW at 43248.9kD 


NOV34a, 
CG149536-01 
Protein Sequence 


MPTTIEREFEELDTQIUlWQPLYLEII^ESHDYPHRVAKFPElsrRISlI^ 
YINASLVDIEEAQRSYILTQGPLPOTCCOTWlAM\mQQK^ 

WPDFGVPESPASFLlSTPLFxCVl^SGSLNPDHGPAVIHCSAGIGRSGTFSLVOTCLVx.^^ 
VLLNMRKYKMGLIQTPDQ^ 

NGmiGLEEEKLTGDRCTGLSSKMQiymEENSESALl^I^ 
RWLYVJQPILTKIaGFMSVILVGAFVGWRLFFQQNAL 



Further analysis of the NOV34a protein yielded the following properties shown in 
Table 34B. 



Table 34B. Protein Sequence Properties NOV34a 


PSort analysis: 


0.8500 probability located in endoplasmic reticulum (membrane); 0.4400 
probability located in plasma membrane; 0.3000 probability located in 
nucleus; 0.1000 probability located in mitochondrial inner membrane 


SignalP analysis: 



No Known Signal Sequence Predicted 



A search of the NOV34a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 34C. 



Table 34C. Geneseq Results for NOV34a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV34a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAR14114 


Non-receptor linked protein 
tyrosine phosphatase - Homo 
sapiens, 415 aa. 
[WO9113989-A, 
19-SEP-1991] 


1..370 
1..415 


368/415 (88%) 
369/415 (88%) 


0.0 


AAU91293 


Human NOV8 protein - 
Homo sapiens, 415 aa. 
[WO200216600-A2, 
28-FEB-2002] 


1..370 
1..415 


337/415 (81%) 
345/415 (82%) 


0.0 


ABP41882 


Human ovarian antigen 
HOCPJ87, SEQ ID NO:3014 
- Homo sapiens, 368 aa. 
[WO200200677-A1, 
03-JAN-2002] 


24..336 
5..362 


312/358 (87%) 
313/358 (87%) 


e-178 


AAM25250 


Human protein sequence 
SEQ ID NO: 765 - Homo 
sapiens, 168 aa. 
[WO200153455-A2, 
26-JUL-2001] 


116..269 
14..167 


137/154(88%) 
145/154(93%) 


1e-77 


AAB56662 


Human prostate cancer 
antigen protein sequence 
SEQ ID NO: 1240 - Homo 
sapiens, 180 aa. 
[WO200055174-A1, 
21-SEP-2000] 


1..124 
29..152 


123/124(99%) 
124/124(99%) 


1e-69 



In a BLAST search of public sequence databases, the NOV34a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 34D. 



Table 34D. Public BLASTP Results for NOV34a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV34a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


P17706 


Protein-tyrosine phosphatase, 
non-receptor type 2 (EC 
3.1.3.48) (T-cell 
protein-tyrosine phosphatase) 
(TCPTP) - Homo sapiens 
(Human), 415 aa. 


1..370 
1..415 


369/415 (88%) 
370/415 (88%) 


0.0 


A33899 


protein-tyrosine-phosphatase 
(EC 3.1.3.48), nonreceptor type 
2 - human, 415 aa. 


1..370 
1..415 


368/415 (88%) 
369/415 (88%) 


0.0 


A60345 


protein-tyrosine-phosphatase 
(EC 3.1.3.48) 11A- human, 
387 aa. 


1..336 
1..381 


334/381 (87%) 
335/381 (87%) 


0.0 


Q922E7 


Protein tyrosine phosphatase, 
non-receptor type 2 - Mus 
musculus (Mouse), 406 aa. 


1..365 
1..405 


323/410 (78%) 
338/410(81%) 


0.0 


Q06180 


Protein-tyrosine phosphatase, 
non-receptor type 2 (EC 
3.1.3.48) (Protein-tyrosine 
phosphatase PTP-2) (MPTP) - 
Mus musculus (Mouse), 382 
aa. 


1..336 
1..376 


298/381 (78%) 
312/381 (81%) 


e-168 



PFam analysis predicts that the NOV34a protein contains the domains shown in the 
Table 34E. 



Table 34E. Domain Analysis of NOV34a 


Pfam Domain 


NOV34a Match Region 


Identities/ 
Similarities 
for the Matched 
Region 


Expect Value 


Y_phosphatase 


42..229 


99/272(36%) 
163/272 (60%) 


5.5e-88 


Example 35. 

The NOV35 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 35A. 



Table 35A. NOV35 Sequence Analysis 




SEQ ID NO: 143 | 908 bp 


NOV35a, 
CG149964-01 
DNA Sequence 


CCCTTCTACCCAGAGGGTGAATGGGTATCTTTCCCGGAATAATCCTAATTTTTCTAAGGGTGAAGTT 
TGC AACGGCGGC CGTGACTGTAAGCGGAC AC C AG AAAAGTACC AC TGTAAGTC ATGAG ATGTCTGGT 
CTGAATTGGAAACCCTTTGTATATGGCGGCCTTGCCTCTATCGTGGCTGAGTTTGGGACTTTCCCTG 
TGGACCTTACCAAAACACGACTTCAGGTTCAAGGCCAAAGCATTGATGCCCGTTTCAAAGAGATAAA 
ATATAGAGGGATGTTCC ATGCGC TGTTTCGC ATC TGTAAAGAGGAAGGTGTATTGGC TCTC TATTCA 
GGAATTGC TCCTGCG TTGCTAAGAC AAGCATC ATATGGC ACCATTAAAATTGGGATTTAC C AAAGCT 
TGAAGCGCTTATTCGTAGAACGTTTAGAAGATGAAACTCTTTTAATTAATATGATCTGTGGGGTAGT 
GTCAGGAGTGATATCTTCCACTATAGCCAATCCCACCGATGTTCTAAAGATTCGAATGCAGGCTCAA 
GGAAGCTTGTTCCAAGGGAGCATGATTGGAAGCTTTATCGATATATACCAACAAGAAGGCACCAGGG 
GTCTGTGGAGGGGTGTGGTTCCAACTGCTCAGCGTGCTGCCATCGTTGTAGGAGTAGAGCTACCAGT 
CTATGATATTACTAAGAAGCATTTAATATTGTCAGGAATGATGGGACATGTGGATCTCTATAAGGGC 
AC TGTTGATGGTATTTTAAAGATGTGGAAAC ATGAGGGC TTTTTTGCAC TCTATAAAGGATTTTGGC 
CAAACTGGCTTCGGCTTGGACCCTGGAACATCATTTTTTTTATTACATACGAGCAGGTAAAGAGGCT 
TCAAATCTAAGAACTGAATTATATGTGAGCCCAGCAC 




ORF Start: ATG at 21 


ORF Stop: TAA at 879 





SEQ ID NO: 144 | 286 aa 


MW at 32043.5kD 


NOV35a, 
CG149964-01 
Protein Sequence 


MGI F PG I IL I FLRVKFATAAVTVSGHQKST WSHEMSGLNTOPFVYGGLAS I VAEFGTFPVDLTKTR 
LQVQGQSIDARFKEIKYRGMFHALFRICKEEGVI^YSGIAPALLRQASYGTIKIGIYQSLKRLFVE 
RLEDETLLINMICGWSGVISSTIANPTDVLKIRMQAQGSLFQGSMIGSFIDIYQQEGTRGLWRGW 
PTAQRAAI WGVEL PVYD I TKKHL IL SGMMGHVDL YKGTVDG I LKMWKHEGF FAL YKGFWPNWLRLG 
PWNIIFFITYEQVKRLQI 





SEQ ID NO: 145 |871 bp 


NOV35b, 
309326356 DNA 
Sequence 


CACCGGATCCACCATGGGTATCTTTCCCGGAATAATCCTAATTTTTCTAAGGGTGAAGTTTGCAACG 
GCGGCCGTGATTCACC AGAAAAGTAC CACTGTAAGTC ATGAG ATGTC TGGTCTGAATTGGAAACCC T 
TTGTATATGGCGGCCTTGCCTCTATCGTGGCTGAGTTTGGGACTTTCCCTGTGGACCTTACCAAAAC 
ACGACTTCAGGTTCAAGGCCAAAGCATTGATGCCCGTTTCAAAGAGATAAAATATAGAGGGATGTTC 
CATGCGCTGTTTCGCATCTGTAAAGAGGAAGGTGTATTGGCTCTCTATTCAGGAATTGCTCCTGCGT 
TGC TAAGAC AAGCATCATATGGC ACCATTAAAATTGGG ATTTACC AAAGCTTGAAGCGCTTATTC GT 
AGAACGTTTAGAAGATGAAACTCTTTTAATTAATATGATCTGTGGGGTAGTGTCAGGAGTGATATCT 
TCCACTATAGCCAATCCCACCGATGTTCTAAAGATTCGAATGCAGGCTCAAGGAAGCTTGTTCCAAG 
GGAGCATGATTGGAAGCTTTATCGATATATACCAACAAGAAGGCACCAGGGGTCTGTGGAGGGGTGT 
GGTTCCAACTGCTCAGCGTGCTGCCATCGTTGTAGGAGTAGAGCTACCAGTCTATGATATTACTAAG 
AAGCATTTAATATTGTCAGGAATGATGGGACATGTGGATCTCTATAAGGGCACTGTTGATGGTATTT 
TAAAGATGTGGAAACATGAGGGCTTTTTTGCACTCTATAAAGGATTTTGGCCAAACTGGCTTCGGCT 
TGGACCCTGGAACATCATTTTTTTTATTACATACGAGCAGGTAAAGAGGCTTCAAATCGTCGACGGC 




ORF Start: at 2 |ORF Stop: end of sequence 


SEQ ID NO: 146 | 290 aa 


MW at 32429.9kD 


NOV35b, 
309326356 
Protein Sequence 


TGSTMGIFPG 1 1 L I FLRVKFATAAVIHQKSTTVSHEMSGLNWKPFVYGGLAS IVAEFGTFPVDLTKT 
RLQVQGQS I DARFKEIKYRGMFHALFRICKEEGVLAL YSGIAPALLRQASYGTIKIGI YQSLKRLFV 
ERLEDETLLINMICGVVSGVISSTIANPTDVLKIRMQAQGSLFQGSMIGSFIDIYQQEGTRGLWRGV 
VPTAQRAAIWGTOLPVYDITKKHLILSGMKGHVDLYKGTVDG 
GPWNI IFF ITYEQVKRLQI VDG 





SEQ ID NO: 147 |811 bp 




NOV35c, 
309326444 DNA 
Sequence 


CACCGGATCCGCCGTGATTCACCAGAAAAGTACCACTGTAAGTCATGAGATGTCTGGTCTGAATTGG 
AAACCCTTTGTATATGGCGGCCTTGCCTCTATCGTGGCTGAGTTTGGGACTTTCCCTGTGGACCTTA 
CCAAAACACGACTTCAGGTTCAAGGCCAAAGCATTGATGCCCGTTTCAAAGAGATAAAATATAGAGG 


CCTGCGTTGC TAAGACAAGCATCAT ATGGCACC ATTAAAATTGGGATTT AC C AAAGCT TGAAGCGC T 
TATTCGTAGAACGTTTAGAAGATGAAACTCTTTTAATTAATATGATCTGTGGGGTAGTGTCAGGAGT 
GATATC TTCCAC TATAGCC AATCCC ACCGATGTTCTAAAGATTCGAATGCAGGCTC AAGG AAGCTTG 
TTCCAAGGGAGCATGATTGGAAGCTTTATCGATATATACCAACAAGAAGGCACCAGGGGTCTGTGGA 
GGGGTGTGGTTCCAACTGCTCAGCGTGCTGCCATCGTTGTAGGAGTAGAGCTACCAGTCTATOATAT 
TACTAAGAAGCATTTAATATTGTCAGGAATGATGGGACATGTGGATCTCTATAAGGGCACTGTTGAT 
GGTATTTTAAAGATGTGGAAACATGAGGGCTTTTTTGCACTCTATAAAGGATTTTGGCCAAACTGGC 
TTCGGCTTGGACCCTGGAACATCATTTTTTTTATTACATACGAGCAGGTAAAGAGGCTTCAAATCGT 
CGACGGC 




ORF Start: at 2 | ORF Stop: end of sequence 





SEQ ID NO: 148 | 270 aa | MW at 30239.1kD 


NOV35c, 
309326444 
Protein Sequence 


TGS AVIHQKSTTVSHEMSGLNWKPFVYGGLAS IVAEFGTF PVDLTKTRLQVQGQS I DARFKE I KYRG 
MFHALFRICKEEGVLALYSGIAPALLRQASYGTIKIGIYQSLKRLFVERLEDETLLINMICGWSGV 
ISSTIANPTDVLKIRMQAQGSLFQGSMIGSFIDIYQQEGTRGLWRGWPTAQRAAIWGVELPVYDI 
TKKHLII^GMMGHVDLYKGTVDGILKMWKHEGFFALYKGFWPNWLRLGPWNI IFFI TYEQVKRLQI V 
DG 





SEQ ID NO: 149 | 761 bp 


NOV35d, 
309326473 DNA 
Sequence 


CACCGGATCCCTGAATTGGAAACCCTTTGTATATGGCGGCCTTGCCTCTATCGTGGCTGAGTTTGGG 
ACTTTCCCTGTGGACCTTACCAAAACACGACTTCAGGTTCAAGGCCAAAGCATTGATGCCCGTTTCA 
AAGAGATAAAATATAGAGGGATGTTCCATGCGCTGTTTCGCATCTGTAAAGAGGAAGGTGTATTGGC 
TCTC TATTC AGGAATTGC TCCTGCGTTGC TAAGACAAGC ATCATATGGC ACC ATTAAAATTGGGATT 
TACCAAAGCTTGAAGCGCTTATTCGTAGAACGTTTAGAAGATGAAACTCTTTTAATTAATATGATCT 
GTGGGGTAGTGTCAGGAGTGATATCTTCCACTATAGCCAATCCCACCGATGTTCTAAAGATTCGAAT 
GCAGGCTCAAGGAAGCTTGTTCCAAGGGAGCATGATTGGAAGCTTTATCGATATATACCAACAAGAA 
GGCACCAGGGGTCTGTGGAGGGGTGTGGTTCCAACTGCTCAGCGTGCTGCCATCGTTGTAGGAGTAG 
AGCTACCAGTCTATGATATTACTAAGAAGCATTTAATATTGTCAGGAATGATGGGACATGTGGATCT 
CTATAAGGGCACTGTTGATGGTATTTTAAAGATGTGGAAACATGAGGGCTTTTTTGCACTCTATAAA 
GGATTTTGGCCAAACTGGCTTCGGCTTGGACCCTGGAACATCATTTTTTTTATTACATACGAGCAGG 
TAAAGAGGCTTCAAATCGTCGACG 




ORF Start: at 2 | ORF Stop: end of sequence 


SEQ ID NO: 150 | 254 aa 


MW at 28488.2kD 


NOV35d, 
309326473 
Protein Sequence 


TGSLN^PFVYGGIASIVAEFGTFPVDL^ 

LYSGIAPALLRQASYGTIKIGIYQSLKRLFVERLEDETLLINMICGWSGVISSTIANPTDVLKIRM 
QAQGSLFQGSMIGSFIDIYQQEGTRGLWRGVVPTAQRAAIWGVELPVYDITKKHLILSGMMGHVDL 
YKGTVDG ILKMWKHEGFFALYKGFWPNWLR^ 





SEQ ID NO: 151 | 1019 bp 




NOV35e, 
CG149964-02 
DNA Sequence 


CTACCCAGAGGGTGAATGGGTATCTTTCCCGGAATAATCCTAATTTTTCTAAGGGTGAAGTTTGCAA 
CGGCGGCCGTGACTGTAAGCGGACACCAGAAAAGTACCACTGTAAGTCATGAGATGTCTGGTCTGAA 
TTGGAAACCCTTTGTATATGGCGGCCTTGCCTCTATCGTGGCTGAGTTTGGGACTTTCCCTGTGGAC 
CTTACCAAAACACGACTTCAGGTTCAAGGCCAAAGCATTGATGCCCGTTTCAAAGAGATAAZ^ATATA 
GAGGGATGTTCCATGCGCTGTTTCGCATCTGTAAAGAGGAAGGTGTATTGGCTCTCTATTCAGGAAT 
TGCTCCTGCGTTGCTAAGACAAGCATCATATGGCACCATTAAAATTGGGATTTACCAAAGCTTGAAG 
CGCTTATTCGTAGAACGTTTAGAAGATGAAACTCTTTTAATTAATATGATCTGTGGGGTAGTGTCAG 
GAGTGATATCTTCCACTATAGCG^TCCCACCGATGTTCTAAAGATTCGAATGCAGGCTCAAGGAAG 
CTTGTTCCAAGGGAGCATGATTGGAAGCTTTATCGATATATACCAGCAAGAAGGCACCAGGGGTCTG 
TGGAGGGGTGTGGTTCCAACTGCTCAGCGTGCTGCCATCGTTGTAGGAGTAGAGCTACCAGTCTATG 
ATATTACTAAGAAGCATTTAATATTGTCAGGAATGATGGGCGATAC^ATTTTAACTCACTTCGTTTC 
CAGCTTTACATGTGGTTTGGCTGGGGCTCTGGCCTCCAACCCGGTTGATGTGGTTCGAACTCGCATG 
ATGAACCAGAGGGCAATCGTGGGACATGTGGATCTCTATAAGGGCACTGTTGATGGTATTTTAAAGA 
TGTGGAAACATGAGGGCTTTTTTGCACTCTATAAAGGATTTTGGCCAAACTGGCTTCGGCTTGGACC 
CTGGAACATCATTTTTTTTATTACATACGAGCAGGTAAAGAGGCTTCAAATCTAAGAACTGAATTAT 
ATGTG AGC C C AGC C 




ORF Start: ATG at 16 | ORF Stop: TAA at 991 





SEQ ID NO: 152 |325 aa 


MW at 36175.2kD 


NOV35e, 
CG149964-02 
Protein Sequence 


MGIFPGIILIFLRVKFATAAVTVSGHQKSTWSHEMSGLI^PFVYGGI^SIVAEFGTFPVDLTKTR 
LQVQGQSIDARFKEIKYRGMFHALFRICKEEGVLALYSGIAPALLRQASYGTIKIGIYQSIJCRLFVE 
RLEDETLL INMICGWSGVI SST I ANPTDVLKIRMQAQGSLFQGSMIGSFIDIYQQEGTRGLWRGW 
PTAQRAAIWGVELPVYDITKKHLILSG^GDTILra 

IVGHVDLYKGTVDGI LKMWKHEGFFALYKGFWPNWLRLGPWWI I FF ITYEQVKRLQI 



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 35B. 



Table 35B. Comparison of NOV35a against NOV35b through NOV35e. 


Protein Sequence 


NOV35a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV35b 


1..286 
5..287 


282/286(98%) 
282/286(98%) 


NOV35c 


26..286 
7..267


261/261 (100%) 
261/261 (100%) 
NOV35d


39..286
4..251


248/248 (100%)
248/248 (100%)


NOV35e 


1..286 
1..325 


286/325 (88%) 
286/325 (88%) 
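
The residue ranges and percentages in Table 35B can be recomputed from the raw counts. The sketch below is illustrative only. The helper names `span_length` and `percent_floor` are hypothetical, and the use of truncation (rather than rounding) is an assumption inferred from entries such as 282/286 being reported as 98%.

```python
def span_length(residues: str) -> int:
    """Length of an inclusive residue range written as 'first..last'."""
    first, last = (int(part) for part in residues.split(".."))
    return last - first + 1


def percent_floor(matches: int, aligned: int) -> int:
    """Percentage of matching positions, truncated to a whole number (assumed convention)."""
    return (100 * matches) // aligned


# (NOV35a residue range, matching positions, aligned positions) taken from Table 35B.
rows = {
    "NOV35b": ("1..286", 282, 286),
    "NOV35c": ("26..286", 261, 261),
    "NOV35e": ("1..286", 286, 325),
}

for name, (span, matches, aligned) in rows.items():
    print(name, span_length(span), percent_floor(matches, aligned))
```

For NOV35e the aligned length (325) exceeds the NOV35a range (286 residues), which is consistent with gaps in the alignment being counted in the denominator.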



Further analysis of the NOV35a protein yielded the following properties shown in 
Table 35C. 
Table 35C. Protein Sequence Properties NOV35a


PSort analysis: 


0.4600 probability located in plasma membrane; 0.2648 probability located in 
microbody (peroxisome); 0.1000 probability located in endoplasmic reticulum 
(membrane); 0.1000 probability located in endoplasmic reticulum (lumen) 


SignalP analysis: 


Cleavage site between residues 20 and 21 



A search of the NOV35a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 35D.



Table 35D. Geneseq Results for NOV35a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV35a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAY94665 


Human uncoupling protein 
(UCP5) amino acid sequence 
- Homo sapiens, 325 aa. 
[WO200032624-A2, 
08-JUN-2000] 


1..286 
1..325 


284/325 (87%) 
285/325 (87%) 


e-158 


ABG33878 


Human secreted protein 
encoded by gene 16 - Homo 
sapiens, 334 aa. 
[WO200226931-A2, 
04-APR-2002] 


1..286 
1..334 


284/334 (85%) 
285/334 (85%) 


e-155 


AAE06056 


Human gene 16 encoded 
secreted protein HMIAP86, 
SEQ ID NO: 118 - Homo
sapiens, 334 aa.
[WO200151504-A1,
19-JUL-2001]


1..286
1..334


284/334 (85%) 
285/334(85%) 


e-155 
AAY87079


Human secreted protein
sequence SEQ ID NO:118 -
Homo sapiens, 335 aa.
[WO200004140-A1,
27-JAN-2000]


1..286
1..334


284/334 (85%)
285/334 (85%)


e-155


AAY94666 


Human uncoupling protein 
isoform hUCP5S amino acid 
sequence - Homo sapiens, 
322 aa. [WO200032624-A2, 
08-JUN-2000] 


1..286
1..322


281/325 (86%) 
282/325 (86%) 


e-154 
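
The Expect values in Table 35D use a shorthand in which a bare exponent such as "e-158" stands for 1e-158. A small Python sketch of how such values might be normalized and ranked is given here. It is an illustration under that assumption, and `parse_expect` is a hypothetical helper, not part of any BLAST tool.

```python
def parse_expect(text: str) -> float:
    """Convert an Expect value string such as 'e-158', '5.7e-24' or '0.0' to a float."""
    text = text.strip()
    if text.startswith("e"):
        # Bare exponent: the mantissa of 1 is implied.
        text = "1" + text
    return float(text)


# Three of the Geneseq hits listed in Table 35D with their Expect values.
hits = {"AAY94665": "e-158", "ABG33878": "e-155", "AAY94666": "e-154"}

# Smaller Expect values indicate more significant matches.
ranked = sorted(hits, key=lambda ident: parse_expect(hits[ident]))
print(ranked)
```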


In a BLAST search of public sequence databases, the NOV35a protein was found to
have homology to the proteins shown in the BLASTP data in Table 35E.


Table 35E. Public BLASTP Results for NOV35a


Protein 

Accession 
Number 


Protein/Organism/Length 


NOV35a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 




Brain mitochondrial carrier
protein-1 (BMCP-1)
(Mitochondrial uncoupling
protein 5) (UCP 5) (Solute
carrier family 25, member 14)
- Homo sapiens (Human), 325
aa.


1..286
1..325


285/325 (87%) 




Q9Z2B2 


Brain mitochondrial carrier
protein-1 (BMCP-1) 
(Mitochondrial uncoupling 
protein 5) (UCP 5) (Solute 
carrier family 25, member 14) 
- Mus musculus (Mouse), 325 
aa. 


1..286 
1..325 


276/325 (84%) 
283/325 (86%) 


e-154 


Q9EP88 


Brain mitochondrial carrier 
protein BMCP1 (Brain 
mitochondrial carrier
protein-1) - Rattus norvegicus 
(Rat), 325 aa. 


1..286 
1..325 


274/325 (84%) 
282/325 (86%) 


e-153 


Q9JMH0 


Brain mitochondrial carrier 
protein-1 - Rattus norvegicus 
(Rat), 322 aa. 


1..286 
1..322 


271/325 (83%) 
279/325 (85%) 


e-149 


Q8R206 


Similar to RIKEN cDNA 
4933433D23 gene -Mus 
musculus (Mouse), 210 aa. 


36..232 
1..197 


160/197 (81%) 
176/197 (89%) 


1e-87
PFam analysis predicts that the NOV35a protein contains the domains shown in the
Table 35F.



Table 35F. Domain Analysis of NOV35a 


Pfam Domain 


NOV35a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


mito_carr 


39..138 


39/126(31%) 
78/126(62%) 


5.7e-24 


mito_carr 


140..231 


29/125 (23%) 
76/125 (61%) 


4.4e-27 


mito_carr 


233..286 


24/125 (19%) 
46/125 (37%) 


0.0072 
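
The three mito_carr match regions in Table 35F can be checked for length and overlap with a short Python sketch. This is illustrative only; the variable and function names are hypothetical, and the regions are copied from the table as inclusive residue ranges.

```python
# Inclusive NOV35a match regions for the three mito_carr hits in Table 35F.
domains = [(39, 138), (140, 231), (233, 286)]


def region_length(start: int, end: int) -> int:
    """Number of residues in an inclusive range."""
    return end - start + 1


lengths = [region_length(start, end) for start, end in domains]

# Consecutive regions are non-overlapping if each one ends before the next begins.
non_overlapping = all(
    prev_end < next_start
    for (_, prev_end), (next_start, _) in zip(domains, domains[1:])
)

print(lengths, sum(lengths), non_overlapping)
```

The three regions are adjacent and do not overlap, which fits the tandem-repeat arrangement typical of mitochondrial carrier proteins.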



Example 36. 

The NOV36 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 36A. 



Table 36A. NOV36 Sequence Analysis 




SEQ ID NO: 153 | 1144 bp




NOV36a, 
CG150306-01 
DNA Sequence 


CGCGGGGCGCGCGGCGCGGGGCGGCCTGGCCGGCGGCGGCGGCGGCATGAAGGTCACGTCGCTCGAC 


GGGCGCCAGCTGCGCAAGATGCTCCGCAAGGAGGCGGCGGCGCGCTGCGTGGTGCTCGACTGCCGGC 
CCTATCTGGCCTTCGCTGCCTCGAACGTGCGCGGCTCGCTCAACGTCAACCTCAACTCGGTGGTGCT 
GGACCAGGGCAGCCGCCACTGGCAGAAGCTGCGAGAGGAGAGCGCCGCGCGTGTCGTCCTCACCTCG 
CTACTCGCTTGCCTACCCGCCGGCCCGCGGGTCTACTTCCTCAAAGGGGGATATGAGACTTTCTACT 
CGGAATATCCTGAGTGTTGCGTGGATGTAAAACCCATTTCACAAGAGAAGATTGAGAGTGAGAGAGC 
CCTCATCAGCCAGTGTGGAAAACCAGTGGTAAATGTCAGCTACAGGCCAGCTTATGACCAGGGTGGC 
CCAGTTGAAATCCTTCCCTTCCTCTACCTTGGAAGTGCCTACCATGCATCCAAGTGCGAGTTCCTCG 
CCAACTTGCACATCACAGCCCTGCTGAATGTCTCCCGACGGACCTCCGAGGCCTGCATGACCCACCT 
ACACTACAAATGGATCCC TGTGGAAGACAGCCACACGGCTGACATTAGC TCCCACT TTC AAGAAGCA 
ATAGACTTCATTGACTGTGTCAGGGAAAAGGGAGGCAAGGTCCTGGTCCACTGTGAGGCTGGGATCT 
CCCGTTCACCCACCATCTGCATGGCTTACCTTATGAAGACCAAGCAGTTCCGCCTGAAGGAGGCCTT 
CGATTACA^CAAGCAGAGGAGGAGCATGGTCTCGCCCAACTTTGGCTTCATGGGCCAGCTCCTGCAG 
TACGAATCTGAGATCCTGCCCTCCACGCCCAACCCCCAGCCTCCCTCCTGCCAAGGGGAGGCAGCAG 
GCTCTTCACTGATAGGCCATTTGCAGACACTGAGCCCTGACATGCAGGGTGCCTACTGCACATTCCC 
TGCCTCGGTGCTGGCACCGGTGCCTACCCACTCAACAGTCTCAGAGCTCAGCAGAAGCCCTGTGGCA 
ACGGCCACATCCTGCTAAAACTGGGATGGAGGAATCGGCCCAGCCCCAAGAGCAACTGTGATTTTTG 


TTTTT 




ORF Start: ATG at 47 




ORF Stop: TAA at 1088 





SEQ ID NO: 154 | 347 aa


MW at 38362.6kD 


NOV36a, 


tmn?SLIX3RQLIUattRKEAAARCVVLDCRPYLAFAASNVRGSLtn/^NSVVLD(^ 
ARWLTSLLACLPAGPRVYFLKGGYETFYSEYPE

PAYDQGG P VEI LPFLYLGS AYHASK.C EFLANLH I TALLNVSRR T S EACMTHLHYKWI P VEDSH TAD I 
SSHFQEAIDFIDCVREKGGKVLVHCEAGISRSPTICMAYLMKTKQFRLKEAFDYIKQRRSMVSPNFG 
FMGQLLQYESEILPSTPNPQPPSCQGEAAGSSLIGHLQTLSPDMQGAYCTFPASVLAPVPTHSTVSE 
LSRSPVATATSC 



CG150306-01 
Protein Sequence 



Further analysis of the NOV36a protein yielded the following properties shown in 
Table 36B. 
Table 36B. Protein Sequence Properties NOV36a


PSort analysis: 


0.4811 probability located in mitochondrial matrix space; 0.4500 probability
located in cytoplasm; 0.1892 probability located in mitochondrial inner
membrane; 0.1892 probability located in mitochondrial intermembrane space


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV36a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 36C.



Table 36C. Geneseq Results for NOV36a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV36a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 


ABB07842 


Amino acid sequence of 
protein identified by 
Swissprot Accn No. Q16690 
- Homo sapiens, 384 aa. 
[WO200220732-A2, 
14-MAR-2002] 


1..347 
1..384 


347/384 (90%) 
347/384 (90%) 


0.0 


AAB66440 


Human MAP-kinase 
phosphatase MKP-5 - Homo 
sapiens, 171 aa. 
[WO200102582-A1, 
11-JAN-2001]


116..286
1..171


171/171 (100%) 
171/171 (100%) 


1e-97


AAE06784 


Human dual-specificity 
phosphatase (DSP) protein, 
MKP-5 - Homo sapiens, 171 
aa. [WO200157221-A2, 
09-AUG-2001] 


116..286 
1..171 


171/171 (100%) 
171/171 (100%) 


1e-97




AAR63602


MAP-kinase-phosphatase
CL100 - Homo sapiens, 367
aa. [WO9423039-A,
13-OCT-1994]


1..347
3..367


168/388 (43%)
220/388 (56%)


5e-72


AAU84270 


Human endometrial cancer 
related protein, DUSP1 - 
Homo sapiens, 367 aa. 
[WO200209573-A2, 
07-FEB-2002] 


1..347 
3..367 


167/388 (43%) 
219/388 (56%) 


1e-70



In a BLAST search of public sequence databases, the NOV36a protein was found to
have homology to the proteins shown in the BLASTP data in Table 36D.



Table 36D. Public BLASTP Results for NOV36a


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV36a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q16690 


Dual specificity protein
phosphatase 5 (EC 3.1.3.48)
(EC 3.1.3.16) (Dual
specificity protein
phosphatase hVH3) - Homo
sapiens (Human), 384 aa.


1..347
1..384


347/384 (90%) 
347/384 (90%) 


0.0 


O54838


Dual specificity protein 
phosphatase 5 (EC 3.1.3.48) 
(EC 3.1.3.16) (MAP-kinase 
phosphatase CPG21) - Rattus 
norvegicus (Rat), 384 aa. 


1..347 
1..384 


320/384 (83%) 
336/384 (87%) 


0.0 


Q90W58 


MAP kinase phosphatase 
XCL100(beta) protein - 
Xenopus laevis (African 
clawed frog), 369 aa. 


13..347 
15..369 


164/378 (43%) 
217/378 (57%) 


9e-72 


P28562 


Dual specificity protein 
phosphatase 1 (EC 3.1.3.48) 
(EC 3.1.3.16) (MAP kinase 
phosphatase-1) (MKP-1) 
(Protein-tyrosine phosphatase 
CL100) (Dual specificity 
protein phosphatase hVH1) -
Homo sapiens (Human), 367 
aa. 


1..347 
3..367 


167/388 (43%) 
219/388 (56%) 


3e-70 
O42253


Dual specificity protein
phosphatase 1 (EC 3.1.3.48)
(EC 3.1.3.16) (MAP kinase
phosphatase-1) (MPK-1)
(MAP kinase phosphatase-1) -
Gallus gallus (Chicken), 353
aa (fragment).


15..344
4..353


213/366 (57%)





PFam analysis predicts that the NOV36a protein contains the domains shown in the 
Table 36E. 
Table 36E. Domain Analysis of NOV36a


Pfam Domain 


NOV36a Match Region 


Identities/ 
Similarities 
for the Matched 
Region 


Expect Value 


Rhodanese 


7..98 


23/134(17%) 
66/134 (49%) 


0.0052 


DSPc 


141..279


76/172(44%) 
132/172 (77%) 


1.8e-70 


Y_phosphatase 


44..279 


39/336(12%) 
144/336 (43%) 


0.54 



Example 37. 

The NOV37 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 37A.



Table 37A. NOV37 Sequence Analysis 




SEQ ID NO: 155 | 2277 bp


NOV37a,
CG150510-01
DNA Sequence 


CGCGTTGTGGGCTCCCGCCGGGGTCCCCCGCGGCTGTCGCCGCCGCCTACGCCGCTGCCTCCGCCTT 


CCTGCCCCGCGTCGGGCCGGGCGCCACCTCCCCCCTGCCTCCCTCTCCGCTGTGGTCATTTAGGAAA 


TCGTAAATCATGTGAAGATGGGACTCTTGGTATTTGTGCGC 

TCTGGTACTGGGATTTTTGTATTATTCTGCGTGGAAGCTACACTTACTCCAGTGGGAGGAGGACTCC 
AGTAAGTATAGTCACTCTAGCTCACCCCAGGAGAAGCCTGTTGCAGATTCAGTGGTTCTTTCCTTTG 
ACTCCGCTGGACAAACACTAGGCTCAGAGTATGATCGGTTGGGCTTCCTCCTGAATCTGGACTCTAA 
ACTGCCTGCTGAATTAGCCACCAAGTACGCAAACTTTTCAGAGGGAGCTTGCAAGCCTGGCTATGCT 
TCAGCCTTGATGACGGCCATCTTCCCCCGGTTCTCCAAGCCAGCACCCATGTTCCTGGATGACTCCT 
TTCGCAAGTGGGCTAGAATCCGGGAGTTCGTGCCGCCTTTTGGGATCAAAGGTCAAGACAATCTGAT 
CAAAGCCATCTTGTCAGTCACCAAAGAGTACCGCCTGACCCCTGCCTTGGACAGCCTCCGCTGCCGC 
CGCTGCATCATCGTGGGCAATGGAGGCGTTCTTGCCAACAAGTCTCTGGGGTCACGAATTGACGACT 
ATGACATTGTGGTGAGACTGAATTCAGCACCAGTGAAAGGCTTTGAGAAGGACGTGGGCAGCAAAAC 
GACACTGCGCATCACCTACCCCGAGGGCGCCATGCAGCGGCCTGAGCAGTACGAGCGCGATTCTCTC 
TTTGTCCTCGCCX3GCTTCAAGTGGCAGGACTTTAAGTGGTTGAAATACATCGTCTACAAGGAGAGAG 
TGAGTGCATCGGATGGCTTCTGGAAATCTGTGGCCACTCGAGTGCCCAAGGAGCCCCCTGAGATTCG 
AATCCTCAACCCATATTTCATCCAGGAGGCCGCCTTCACCCTCATTGGCCTGCCCTTCAACAATGGC 
CTCATGGGCCGGCK^AACATCCCTACCCTTGGCAGTGT

ACGAGGTGGCAGTCGCAGGATTTGGCTATGACATGAGCACACCCAACGCACCCCTGCACTACTATGA 
GAC CGTTCGC ATGGC AGCC ATC AAAG AGTC CTGGACGCAC AATATCC AGCG AGAGAAAGAGTTTCTG 
pr:riAfir!rqY^Tr:a2ka^rTrr:rr:TrJXTrArTRATnTAAGCAGTGGCATCTGAGTGGGCCCAGCACATG 
GCCATAGAGGCCCAGGCACCACCAGGAGCAGCAGCCAGCACCACCTACACAGGAGTCTTCAGACCCA 


GAGAAGGACGGTGCCAAGGGCCCCAGGGGCAGCAAGGCCTTGGTGGAGCAGCCAGAGCTGTGCCTGC 


TCAGCAGCCAGTCTCAGAGACCAGCACTCAGCCTCATTCAGCATGGGTGCTTGATGCCAGAGGGCCA 


GCAGGCTCCTGGCTGTGCCCAGCAGGCCCAGCATGCAGGTGGTGGGACACTGGGCAGCAAGGCTGCT 


GCCGGAATCACTTCTCCAATCAGTGTTTGGTGTATTATCATTTTGTGAATTTGGGTAGGGGGGAGGG 


TAGGGATAATTTATTTTTAAATAAGGTTGGAGATGTCAAGTTGGGTTCACTTGCCATGCAGGAAGAG 


GCCCACTAGAGGGCCCATCAGGCAGTGTTACCTGTTAGCTCCCTGTGGGGCAGGAGTGCCAGGACCA 


GCCTGTACCTTGCTGTGGGGCTACAGGATGGTGGGCAGGATCTCAAGCCAGCCCCCTCCAGCTCATG 


ACACTGTTTGGCCTTTCTTGGGGAGAAGGCGGGGTATTCCCACTCACCAGCCCTAGCTGTCCCATGG 


GGAAACC CTGG AGCCATCC CT TC GGAGC C AAC AAGACCGCCCC AGGGC T ATAGC AGAAAGAACTTTA 


AAGCTCAGGAGGGTGACGCCCAGCTCCGCCTGCTGGGAAGAGCTCCCCTCCACAGCTGCAGCTGATC 


CATAGGACTACCGCAGGCCCGGACTCACCAACTTGCCACATGTTCTAGGTTTCAGCAACAAGACTGC 


CAGGTGGTTGGGTTCTGCC TT TAGC CTGGAC CAAAGGGAAGTGAGGCCC AAGGAGCTTACCC AAGCT 


GTGGCAGCCGTCCCAGGCCACCCCCATGGAAGCAATAAAGCTCTTCCCTGTAAAAAAAAAAAAAAA 




ORF Start: ATG at 152 | ORF Stop: TGA at 1322





SEQ ID NO: 156 | 390 aa


MW at 43785.1kD


NOV37a, 
CG150510-01 
Protein Sequence 


MGLLVFVX^LIiALCLFLVLGFLYYSAWKLHLIiQWEEDSSKYSHSSSPQEKPVADSVVLSFDSAGQT 
LGSEYDRLGFLLNLDSKLPAELATKYANFSEGACKPGYASALMTAIFPRFSKPAPMFLDDSFRKWAR 
IREFVP PFG I KGQDNL I KAI LS VTKEYRLTPALDSLRCRRC 1 I VGNGG VL ANKSLG SRI DDYDI WR 
LNSAPVKGFEKDVGSKTTLRITYPEGAMQRPEQYERDSLFVLAGFKWQDFKWLKYIVYKERVSASDG 
FWKSVATRVPKEPPEIRILNPYFIQEAAFTLIGLPFI^GLMGRGNIPTLGSVAVTMTVLHGCDEVAVA 
GFGYDMSTPNAPLHYYETVRMAAIKESWTHNIQREKEFLRKLVKARVITDLSSGI 



Further analysis of the NOV37a protein yielded the following properties shown in 
Table 37B. 
Table 37B. Protein Sequence Properties NOV37a


PSort analysis: 


0.8200 probability located in outside; 0.2360 probability located in microbody 
(peroxisome); 0.1900 probability located in lysosome (lumen); 0.1000 
probability located in endoplasmic reticulum (membrane) 


SignalP analysis: 


Cleavage site between residues 22 and 23 
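
A predicted cleavage site implies a length for the mature polypeptide. The following Python sketch shows the arithmetic, assuming the signal peptide comprises residues 1 through 22 of the 390 aa NOV37a sequence; `mature_length` is a hypothetical helper and the result is a prediction-derived estimate, not a measured value.

```python
def mature_length(total_residues: int, last_signal_residue: int) -> int:
    """Residues remaining after removing a signal peptide that ends at `last_signal_residue`."""
    if not 0 < last_signal_residue < total_residues:
        raise ValueError("cleavage site must fall inside the sequence")
    return total_residues - last_signal_residue


# NOV37a: 390 aa, SignalP cleavage predicted between residues 22 and 23.
nov37a_mature = mature_length(390, 22)
print(nov37a_mature)  # 368
```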



A search of the NOV37a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 37C.



Table 37C. Geneseq Results for NOV37a




Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date]


NOV37a
Residues/
Match
Residues


Identities/
Similarities for
the Matched
Region


Expect
Value


AAY39960 


Human alpha2-3 sialate 
transferase protein sequence - 
Homo sapiens, 375 aa. 
[JP11253163-A, 
21-SEP-1999] 


1..390 
1..375 


374/390(95%) 
375/390(95%) 


0.0 


AAR65242 


Human ST3N 
sialyltransferase - Homo 
sapiens, 375 aa. 
[WO9504816-A, 
16-FEB-1995] 


1..390
1..375 


374/390 (95%) 
375/390(95%) 


0.0 




Human
alpha-2,3-sialyltransferase
(WM16) - Homo sapiens
(melanoma WM266-4 cells),
375 aa. [WO9423021-A,
13-OCT-1994]


1..390
1..375


374/390 (95%)
375/390 (95%)


0.0


AAR62808 


Alpha 2, 3-sialyl transferase - 
Homo sapiens, 375 aa. 
[JP06277052-A, 
04-OCT-1994] 


1..390 
1..375 


374/390 (95%) 
375/390 (95%) 


0.0 


AAR41671 


Rat sialyltransferase - Rattus 
rattus, 374 aa. 
[WO9318157-A,
16-SEP-1993] 


1..390 
1..374 


361/390 (92%) 
370/390(94%) 


0.0 



In a BLAST search of public sequence databases, the NOV37a protein was found to
have homology to the proteins shown in the BLASTP data in Table 37D. 



Table 37D. Public BLASTP Results for NOV37a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV37a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q11203 


CMP-N-acetylneuraminate-beta-1,4-galactoside
alpha-2,3-sialyltransferase (EC
2.4.99.6) (N-acetyllactosaminide
alpha-2,3-sialyltransferase) (Gal
beta-1,3(4) GlcNAc alpha-2,3
sialyltransferase) (ST3N)
(Sialyltransferase 6) - Homo sapiens
(Human), 375 aa.


1..390
1..375


374/390(95%) 
375/390(95%) 


0.0 
Q922X5


Sialyltransferase (N-acetyllacosaminide
alpha 2,3-sialyltransferase) - Mus
musculus (Mouse), 374 aa.


1..390
1..374 


371/390 (94%) 




Q9DBB6 


Sialyltransferase (N-acetyllacosaminide 
alpha 2,3- sialyltransferase) - Mus 
musculus (Mouse), 374 aa. 


1..390 
1..374 


360/390 (92%) 
371/390 (94%) 


0.0 


Q02734 


CMP-N-acetylneuraminate-beta-1,4-galactoside
alpha-2,3-sialyltransferase (EC
2.4.99.6) (N-acetyllactosaminide
alpha-2,3-sialyltransferase) (Gal
beta-1,3(4) GlcNAc alpha-2,3
sialyltransferase) (ST3N) 
(Sialyltransferase 6) - Rattus norvegicus 
(Rat), 374 aa. 


1..390 
1..374 


361/390 (92%) 
370/390 (94%) 


0.0 


P97325 


CMP-N-acetylneuraminate-beta-1,4-galactoside
alpha-2,3-sialyltransferase (EC
2.4.99.6) (N-acetyllactosaminide
alpha-2,3-sialyltransferase) (Gal
beta-1,3(4) GlcNAc alpha-2,3
sialyltransferase) (ST3N) 
(Sialyltransferase 6) - Mus musculus 
(Mouse), 374 aa. 


1..390 
1..374 


359/390 (92%) 
370/390 (94%) 


0.0 



PFam analysis predicts that the NOV37a protein contains the domains shown in the 
Table 37E. 
Table 37E. Domain Analysis of NOV37a


Pfam Domain 


NOV37a Match 
Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


Glyco_transf_29 


101..389 


108/324 (33%) 
270/324 (83%) 


3.2e-116 



Example 38. 

The NOV38 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 38A.



Table 38A. NOV38 Sequence Analysis 



SEQ ID NO: 157 | 1076 bp


NOV38a,
CG150704-01
DNA Sequence


CCCTTATGAAGACGGGACATTTTGAAATAGTCACCATGCTGCTGGCAACCATGATTCTAGTGGACAT


TTTCCAGGTGAAGGCTGAAGTGTTAGACATGGCAG^lfr)^ 

ACGG ACAGGATGGAAAT TAAATACGT TCCCC AAC TGCTAAAGGAGGAAAAAGC AAGCC ACC AGCAAT 
T AGATAC TGTGTGGGAAAATGC AAAAGC C AAATGGGCAGCCCGAAAGACTC AAATCTT TCTCCCTAT 
GAATTTTAAGGATAACCATGGAATAGCCCTGATGGCATATATTTCCGAAGCTCAAGAGCAAACTCCC 
TTTTACCATCTGTTCAGTGAAGCTGTGAAGATGG CTGGCCAATCTCGAGAAGATTATATC TATGGCT 
TC CAGTTC AAAGCTTTC CAC T TTTAC C TC AC AAGAGCCCTGCAGT TGCTGAG AAAAC C TTGTGAGGC 
CAGTTCCAAAACTGTGGTATATAGAACAAGCCAGGGCACTTCATTTACATTTGGAGGGCTAAACCAA 
GCCAGGTTTGGCCATTTTACCTTGGCATATTCAGCCAAACCTCAGGCTGCTAATGACCAGCTCACTG 
TGTTATCCATCTACACATGCC TTGGAGTTG ACATTGAAAATTTTC TTG ATAAAGAAAGTGAAAGAAT 
TACTTTAATACCTCTGAATGAGGTTTTTCAAGTGTCACAGGAGGGGGCTGGCAATAACCTTATCCTT 
C AAAGC ATAAAC AAGAC CTGC AGCC AT TATGAGTGTGC ATTTC TAGGTGGAC TAAAAACCGAAAACT 
GTATTGAGAACCTAGAATATTTTCAACCCATCTATGTCTACAACCCTGGTGAGAAAAACCAGAAGCT 
TG AAGAC CATAGT G AGAAAAAC TGGAAGC TTG AAGAC C ATGGTG AG AAAAAC C AG AAG CTT G AAGAC 
CATGCTCCAGGTCCAGTTCCTGTTCCAGGTCCCAAAAGCCATCCTTCTGCATCCTCGGGCAAACTGC 
TGCTTCCACAGTTTGGGATGGTCATCATTTTAATCAGTGTTTCTGCTATAAATCTCTTTGTTGCTCT 
GTAG 




ORF Start: ATG at 6 




ORF Stop: TAG at 1074 





SEQ ID NO: 158 | 356 aa | MW at 40311.7kD


NOV38a, 
CG150704-01 
Protein Sequence 


MKTGHFEIVTMLLATMILVDIFQVKAEVLDMADNAFDDEYLKCTDRMEIKYOT 
TVWENAKAKWAARKTQ I FL PMNFKDNHGI ALMAYI SEAQEQTPF YHLFS EAVKMAGQSREDYI YGFQ 
FKAFHFYLTI^QLLRKPCEASSKTVVYRTSQGTSFTFGGLNQARFGHFTLAYSAKPQAANDQLTVL 
SIYTCLGVDIENFLDKESERITLIPLNEVFQVSQEGAGNNLILQSINKTCSHYECAFLGGLKTENCI 
EmiEYFQPIYVYNPGEKNQKLEDHSEKNWKLEDHGEKNQKLEDHAPGPVPVPGPKSHPSASSGKLL^ 
PQFGMVI ILI SVS AINLFVAL 



Further analysis of the NOV38a protein yielded the following properties shown in 
Table 38B. 
Table 38B. Protein Sequence Properties NOV38a


PSort analysis: 


0.6850 probability located in endoplasmic reticulum (membrane); 0.6400 
probability located in plasma membrane; 0.4600 probability located in Golgi 
body; 0.1000 probability located in endoplasmic reticulum (lumen) 


SignalP analysis: 


Cleavage site between residues 27 and 28 



A search of the NOV38a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 38C.



Table 38C. Geneseq Results for NOV38a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date]


NOV38a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 




AAR41876 


Human HT6 - Homo sapiens, 
230 aa. [DE4209216-A, 
23-SEP-1993] 


29..256 
7..227 


82/238 (34%) 
120/238 (49%) 


1e-21


AAW76806 


Human 

ADP-ribosyltransferase 
protein - Homo sapiens, 327 
aa. [US5834310-A, 
10-NOV-1998] 


20..266 
31..287 


83/266(31%) 
123/266 (46%) 


6e-21 


AAW/6004 


Rabbit skeletal muscle 
ADP-ribosyltransferase 
protein - Oryctolagus 
cuniculus, 327 aa. 
[US5834310-A, 
10-NOV-1998] 


8..259
6..280


88/282 (31%)
130/282 (45%) 


1e-20


AAR37572 


Rabbit skeletal muscle 
ADP-ribosyltransferase - 
Oryctolagus cuniculus, 327 
aa. [USN7985698-N, 
01-MAY-1993] 


8..259 
6..280 


88/282 (31%) 
130/282 (45%) 


1e-20


ABB97573 


Novel human protein SEQ ID 
NO: 841 - Homo sapiens, 229 
aa. [WO200222660-A2, 
21-MAR-2002] 


29..163
29..161


59/137 (43%) 
76/137 (55%) 


1e-20


In a BLAST search of public sequence databases, the NOV38a protein was found to
have homology to the proteins shown in the BLASTP data in Table 38D. 


Table 38D. Public BLASTP Results for NOV38a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV38a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


S62906 


mono-ADP-ribosyltransferase - 
human, 367 aa. 


1..356 
1..367 


356/367 (97%) 
356/367 (97%) 


0.0 


Q8WVJ7 


Hypothetical 42.7 kDa protein - 
Homo sapiens (Human), 378 
aa. 


1..356 
1..378 


355/378 (93%) 
355/378 (93%) 


0.0 


Q13508 


Ecto-ADP-ribosyltransferase 3 
precursor (EC 2.4.2.31) 
(NAD(P)(+)--arginine 
ADP-ribosyltransferase 3) 
(Mono(ADP-ribosyl)transferase
3) - Homo sapiens (Human),
389 aa.


1..356 
1..389 


355/389 (91%) 
355/389 (91%) 


0.0 

Q96HL1 


Unknown (protein for 
MGC: 14489) - Homo sapiens 
(Human), 389 aa. 


1..356
1..389


354/389 (91%)
354/389 (91%)


0.0


Q9GKV6 


Hypothetical 38.2 kDa protein - 
Macaca fascicularis (Crab 
eating macaque) (Cynomolgus 
monkey), 338 aa. 


31..356
1..338 


300/338 (88%) 
312/338 (91%) 


e-174 



PFam analysis predicts that the NOV38a protein contains the domains shown in the 
Table 38E. 



Table 38E. Domain Analysis of NOV38a 


Pfam Domain 


NOV38a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


ART 


1..312 


164/340(48%) 
312/340 (92%) 


1.5e-200 



Example 39. 

The NOV39 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 39A. 



Table 39A. NOV39 Sequence Analysis 




SEQ ID NO: 159 | 8350 bp


NOV39a, 
CG150799-01 
DNA Sequence 


CAGGGAAAAGGGAACCTATGGAATGGTCATGGTGACTTTTGAGGTAGAGGGTGGCCCAAATCCCCCT 

GATGAAGATTTGAGTCCAGTTAAAGGAAATATCACCTTTCCCCCTGGCAGAGCAACAGTAATTTATA 

ACTTGACAGTACTCGATGACGAGGTACCAGAAAATGATGAAATATTTTTAATTCAACTGAAAAGTGT 

AGAAGGAGGAGCTGAGATTAACACCTCTAGGAATTCCATTGAGATCATCATTAAGAAAAATGATAGT 

CCCGTGAGATTCCTOCAGAGTATTTATTTGGTTCCTGAGGAAGACCACATACTCATAATTCCAGTAG 

TTCGTGGAAAGGACAAC AATGGAAATC TG ATTGGATCTGATGAATATGAGGT TTCAATCAGTTATGC 

TGTCACAACTGGGAATTCCACAGCACATGCCCAGCAAAATCT 

ACAACTGTTGTTTTTCCACCTTTT^^ 

CACCGGAGATTGCTGAATCGTTTCACATTATGTTACTAAAAGATACCTTACAGGGAGATGCTGTGCT 
AATAAGCCCTTCTGTTGTACAAGTCACCATTAAGCCAAATGATAAACCTTATGGAGTCCTTTCATTC 




CAGTGGTTAGAAATGGAGGAACCCATGGGAATGTCTCTGCGAATTGGGTGTTGACACGGAACAGCAC 
TGATCCCTCACCAGTAAC^GCAGATATCAGACCGAGCTCTGGAGTTCTCCATTTTGCACAAGGGCAG 
ATGTTGGCAACAATTCCTCTTACTGTGGTTGATGATGATCTTCCAGAAGAGGCAGAAGCTTATCTAC 
TTCAAATTCTGCCTCATACAATACGAGGAGGTGCAGAAGTGAGCGAGCCAGCGGAGGATAGTGATGA 
TGTCTATGGCCTAATAACATTTTTTCCTATGGAAAACCAGAAGATTGAAAGCAGCCCAGGTGAACGA 
TACTTATCCTTGAGTTTTACAAGACTAGGAGGGACTAAAGGAGATGTGAGGTTGCTTTATTCTGTAC 
TTTACATTCCTGCTGGAGCTGTGGACCCCTTGCAAGCAAAAGAAGGCATCOT 

AAATGACCTCATTTTTCCAGAGCAAAAAACTCAAGTCACTACAAAATTACC^TAAGAAATGATGCA 
TTCTTTCAAAATGGAGCTCACTTTCTAGTACAGTTGGAAACTGTGGAGTTGTTAAACATAATTCCTC 
TAATCCCACCCATAAGCCCTAGATTTGGGGAAATCTGCAATATTTCTTTACTGGTTACTCCAGCCAT 
TCCAAATGGAGAAATTGGCTTTCTCAGCAATCTTCCAATTATTTTGCATGAACCAGAAGATTTTGCT 
GCTGAAGTGGTATACATTCCCTTACATCGGGATGGAACTGATGGCCAGGCTACTGTCTACTGGAGTT 
TCAAGCCCTCTGGCTTTAATTCAAAAGCAGTGACCCCGGATGATATAGGCCCCTTTAATGGCTCTGT 




I TTTC^^ 

ATGAATGAAACTGTAACACTTTCTCTAGACAGGGTTAACGTGGAAAACCAAGTGCTGAAATCTGGAT 
ATACTAGCCGTGACCTAATTATTTTGGAAAATGATGACCCTGGGGGAGTTTTTGAATTTTCTCCTGC 
TTCCAGAGGACCCTATGTTATAAAAGAAGGAGAATCTGTAGAGCTCCACATCATCCGATCAAGGGGG 
TCCCTTGTTAAGCAGTTTCTACACTACCGAGTAGAGCCAAGAGATAGCAATGAATTCTATGGAAACA 
CGGGAGTACTAGAATTTAAACCTGGAGAAAGGGAGATAGTGATCACCTTGCTAGCAAGATTGGATGG 
GATACCAGAGTTGGATGAACACTACTGGGTGGTCCTCAGCAGCCACGGAGAACGGGAAAGCAAGTTG 
GGAAGTGCCACCATTGTCAATATAACGATTCTGAAAAATGATGATCCTCATGGCATTATAGAATTTG 
TTTCTGATGGTCTAATTGTGATGATAAATGAAAGCAAAGGAGATGCTATCTATAGTGCTGTTTATGA 
TGTAGTAAGAAATCGAGGCAACTTTGGTGATGTTAGTGTATCATGGGTGGTTAGTCCAGACTTTACA 
C AAGATGTATTTCCTG TAC AAGGG AC TGTTGTCTTTGG AGATCAGGAATTTTC AAAAAATATC ACCA 
TTTACTCCCTTCCAGATGAGATTCCAGAAGAAATGGAAGAATTTAC CGTTATCC TACTGAATGGC AC 
TGG AGGAGC TAAAGTGGG AAATAG AAC AAC TGC AAC T C T GAGGATTAG AAG AAATGATGACCCC ATT 
■ TATTTTGCAGAACCTCGTGTAGTGAGGGTTCAGGAAGGTGAGACTGCCAACTTTACAGTTCTCAGAA 
ATGGATCTGTTGATGTGACTTGCATGGTCCAGTATGCTACCAAGGATGGGAAGGCTACTGCAAGAGA 
G AGAGATTTCATTCCTGTTGAAAAAGGAGAAACGC TCATTT TTGAGGTTGGAAGT AG AC AGC AGAGC 
; ATATCC ATATTTGTTAATGAAGATGGTATCC CGG AAAC AGATG AGCCCTTTTATATAATCC TCTTGA 
i ATTCAACAGGTGATACAGTAGTATATCAATATGGAGTAGCTACAGTAATAATTGAAGCTAATGATGA 
; CCCAAATGGCATTTTTTCTCTGGAGCC CATAGACAAAGCAGTGGAAGAAGGAAAGACTAATGCATTT 
TGGATTTTGAGGCACCGAGGATACTTTGGTAGTGTTTCTGTATCTTGGCAGCTCTTTCAGAATGATT 
CTGCTTTGCAGCCTGGGCAGGAGTTCTATGAAACTTCAGGAACTGTTAACTTCATGGATGGAGAAGA 
AGC AAAAC C AATC ATTC TCC ATGCTT T TC CAGAT AAAATTC CTGAATTC AATGAATTTTATTTCC TA 
AAAC TTGTAAAC ATTTCAGGTC CTGGGGGCC AGC T AGC AGAAAC CAAC CTCCAGGTGAC AGTAATGG 
TTCC ATTC AATG ATGATCCC TTTGGAG TT TT TATC TTGGAT CC AGAGTGTTTAG AGAGAGAAGTGGC 
AG AAG ATG TCC TGTCTG AAGATGATATGTC T TATATTAC C AACTTC AC C ATTTTGAGGC AGC AGGGT 
GTGTTTGGTGATGTAC AACTGGGCTGGGAAATAC TGTCCAG TGAGT TC C CTGCTGGTTTGCC ACC AA 
TGATAGATTTTTTACTGGTTGGAATTTTCCCCACCACCGTGCATTTACAACAGCACATGCGGCGTCA 
CCACAGTGGAACGGATGCTTTGTACTTTACCGGACTAGAGGGTGCATTTGGGACTGTTAATCCAAAA 
T ACC ATCCC TCC AGGAATAATAC AATTGC CAAC TTTAC ATTCTCAGCTTGGGT AATGCC CAATGC C A 
ATACGAATGGATTCATTATAGCGAAGGATGACGGTAATGGAAGCATCTACTACGGGGTAAAAATACA 
AACAAACGAATCCCATGTGACACTTTCCCTTCATTATAAAACCTTGGGTTCCAATGCTACATACATT 
GCCAAGACAACAGTCATGAAATATTTAGAAGAAAGTGTTTGGCTTCATCTACTAATTATCCTGGAGG 
ATGGTATAATCGAATTCTACCTGGATGGAAATGCAATGCCCAGGGGAATCAAGAGTCTGAAAGGAGA 
AGCCATTAC TGACGGTCC TGGGATAC TG AGAATTGGAGCAGGGATAAATGGC AATGACAGATTTAC A 
GGTCTGATGCAGGATGTGAGGTCCTATGAGCGGAAACTGACGCTTGAAGAAATTTATGAACTTCATG 
CC ATGCCCGC AAAAAGTGATTTAC ACCC AATTTC TGGATATCTGGAGTTCAGACAGGGAGAAAC TAA 
C AAATC ATTC ATTATT TC TGC AAG AGAT GAC AATG ACGAGGAAGGAGAAGAATTATTCATTC TTAAA 
C TAGTTTCTGTATATGG AGGAGC TCGTATTTCGGAAGAAAATACTACTGCAAGATTAACAATAC AAA 
AAAGTG AC AATGC AAATGGC TTGTTTGGT TT C AC AGGAGCTTGTATAC CAGAGATTGCAGAGGAGGG 
ATCAACCATTTCTTGTGTGGTTGAGAGAACCAGAGGAGCTCTGGATTATGTGCATGTTTTTTACACC 
ATTTC ACAGATTGAAACTGATGGCAT TAATT ACC TTGTTGATGACTTTGC TAATGCCAGTGGAACTA 
TTAC ATTCCTTC CTTGGC AGAG ATC AGAGGT TC TGAATATATATGTTC TTGATGATG ATATTCC TGA 
ACTTAATGAGTATTTCCGTGTGACATTGGTTTCTGCAATTCCTGGAGATGGGAAGCTAGGCTCAACT 
CCTACCAGTGGTGCAAGCATAGATCCTGAAAAGGAAACGACTGATATCACCATCAAAGCTAGTGATC 
ATCCATATGGCTTGCTGCAGTTCTCCACAGGGCTGCCTCCTCAGCCTAAGGACGCAATGACCCTGCC 
TGCAAGCAGCGTTCCACATATCACTGTGGAGGAGGAAGATGGAGAAATCAGGTTATTGGTCATCCGT 
GC ACAGGGAC TTCTGGGAAGGGTG AC TGCGGAAT TT AG AACAGTGTC C TTGAC AGCATTC AGTCCTG 
AGGATTACCAGAATGTTGCTGGCACATTAGAATTTCAACCAGGAGAAAGATATAAATACATTTTCAT 
AAAC ATC AC TG AT AATTC T ATTC CTGAACTGGAAAAATCTTTTAAAGT TGAGT TGTTAAACT TGGAA 
GGAGG AGTAGCTGAAC TC TT TAGGGTTGATGGAAG TGGTAGTGCC AGTCTAGGAGTGGC TTCCCAAA 
T TC TAGTGAC AATTGC AGCC TCTG AC C ACGC TC ATGG CGTATTTGAATTTAGCCC TGAGTC ACTCTT 
TGTCAGTGGAACTGAACCAGAAGATGGGTATAGCACTGTTACATTAAATGTTATAAGACATCATGGA 
ACTCTGTCTCCAGTGACTTTGCATTGGAACATAGACTCTGATCCTGATGGTGATCTCGCCTTCACCT 
CTGGCAACATCACATTTGAGATTGGGCAGACGAGCGCCAATATCACTGTGGAGATATTGCCTGACGA 
AGACCCAGAACTGGATAAGGCATTCTCTGTGTCAGTCCTCAGTGTTTCCAGTGGTTCTTTGGGAGCT 
CATATTAATGCCACGTTAACAGTTTTGGCTAGTGATGATCCATATGGGATATTCATTTTTTCTGAGA 
AAAACAGACCTGTTAAAGTTGAGGAAGCAACCCAGAACATCACACTATCAATAATAAGGTTGAAAGG 
CCTCATGGGAAAAGTCCTTGTCTCATATGCAACACTAGATGATATGGAAAAACCACCTTATTTTCCA 
CCTAATTTAGCGAGAGCAACTCAAGGAAGAGACTATATACCAGCTTCTGGATTTGCTCTTTTTGGAG 
CTAATCAGAGTGAGGCAACAATAGCTATTTCAATTTTGGATGATGATGAGCCAGAAAGGTCCGAATC 
TGTCTTTATCGAACTACTCAACTCTACTTTAGTAGCGAAAGTACAGAGTCGTTCAATTCCAAATTCT 
C CACGTCTTGGGCCTAAGGTAG AAACTATTGC GC AAC TAATTATC ATTGCCAATGATGATGCATTTG 
GAACTCTTCAGCTCTCAGCACCAATTGTCCGAGTGGCAGAAAATCATGTTGGACCCATTATCAATGT 
GAC TAGAACAGG AGGAGC ATTTGC AG ATGTC TC TGTGAAGTTTAAAGCTGTGCCAATAACTGCAATA 
GCTGGTGAAGATTATAGTATAGCTTCATCAGATGTGGTCTTGCTAGAAGGGGAAACCAGTAAAGCCG 
TGCCAATATATGTCATTAATGATATCTATCCTGAACTGGAAGAATCTTTTCTTGTGCAACTGATGAA 
TGAAACAACAGGAGGAGCCAGACTAGGGGCTTTAACAGAGGCAGTCATTATTATTGAGGCCTCTGAT 
GACCCCTATGGATTATTTGGTTTTCAGATTACTAAACTTATTGTAGAGGAACCTGAGTTTAACTCAG 
TGAAGGTAAACCTGCCAATAATTCGAAATTCTGGGACACTCGGCAATGTTACTGTTCAGTGGGTTGC 
CACCATTAATGGACAGCTTGCTACTGGCGACCTGCGAGTTGTCTCAGGTAATGTGACCTTTGCCCCT 

TCCAAGTGCAACTAACTGATGCCTCTGGTGGAGGTACTATTGGGTTAGATCGAATTGCAAATATTAT 
TATTC CTGCC AATGATGATCCTTATGGTACAGT AGCC TTTGC TCAGATGGTTTATC GTGTTC AAGAG 
CCTCTGGAAAGAAGTTCCTGTGCTAATATAACTGTCAGGCGAAGCGGAGGGCACTTTGGTCGGCTGT 
TGTTGTTCTACAGTACTTCCGACATTGATGTAGTGGCTCTGGCAATGGAGGAAGGTCAAGATTTACT 
GTCCTACTATGAATCTCCAATTCAAGGGGTGCCTGACCCACTTTGGAGAACTTGGATGAATGTCTCT 
GCCGTGGGGGAGCCCCTCTATACCTGTCCCACTTT

CATTTTTCAGTGCTTCTGAGGGTCCCCAGTGTTTCTGGATGACATCATGGATCAGCCCAGCTGTCAA 
C AATTC AG ACTTCTGG ACC TAC AGG AAAAAC ATGAC CAGGGT AGC ATC TC TTTTTAGTGGTCAGGCT 
GTGGC TGGGAGTGAC TATG AGCC TGTGACAAGGC AATGGGC C ATAATGC AGGAAGGTGATGAATTCG 
CAAATCTCACAGTGTCTATTCTTCCTGATGATTTCCCAGAGATGGATGAGAGTTTTCTAATTTCTCT 
CCTTGAAGTTCACCTCATGAACATTTCAGCCAGTTTGAAAAATCAGCCAACCATAGGACAGCCAAAT 
ATTTC TAC AGTTGTC ATAGCACTAAATGGTGATGCCTT TGGAGTGTTTGTGATCTAC AATATTAGTC 
CCAATACTTCCGAAGATGGCTTATTTGTTGAAGTTCAGGAGCAGCCCCAAACCTTGGTGGAGCTGAT 
GATACACAGGACAGGGGGCAGCTTAGGTCAAGTGGCAGTCGAATGGCGTGTTGTTGGTGGAACAGCT 
ACTGAAGGTTTAGATTTTATAGGTGCTGGAGAGATTCTGACCTTTGCTGAAGGTGAAACCAAAAAGA 
C AGTC ATTTTAACCATCTTGGATGAC TC TGAACC AGAGGATG ACG AAAGT ATCATAGTTAGTTTGGT 
GTACACTG AAGGTGG AAGTAG AATTT TGCC AAGC TC CG AC ACTGTTAGAGTGAACATT TTGGCCAAT 
GACAATGTGGCAGGAATTGTTAGCTTTCAGACAGCTTCCAGATCTGTCATAGGTCATGAAGGAGAAA 
TTTTACAATTCCATGTGATAAGAACTTTCCCTGGTCGAGGAAATGTTACTGTTAACTGGAAAATTAT 
TGGGCAAAATCTAGAACTCAATTTTGCTAACTTTAGCGGACAACTTTTCTTTCCTGAGGGGTCGTTG 
AATACAACATTGTTTGTGCATTTGTTGGATGACAACATTCCTGAGGAGAAAGAAGTATACCAAGTCA 
TTCTGTATGATGTCAGGACACAAGGAGTTCCACCAGCCGGAATCGCCCTGCTTGATGCTCAAGGATA 
TGCAGCTGTCCTCACAGTAGAAGCCAGTGATGAACCACATGGAGTTTTAAATTTTGCTCTTTCATCA 
AGATTTGTGTTACTACAAGAGGCTAACATAACAATTCAGCTTTTCATCAACAGAGAATTTGGATCTC 
TAGGAGCTATCAATGTCACATATACCACGGTTCCTGGAATGCTGAGTCTGAAGAACCAAACAGTAGG 
AAAC C T AGC AG AGCC AGAAGTTGATTTTGTCCCT ATCATTGGC TT TCTGATTTTAGAAGAAGGGGAA 
ACAGCAGCAGCCATCAACATTACCATTCTTGAGGATGATGTACCAGAGCTAGAAGAATATTTCCTGG 
TGAATTTAAC TT ACGTTGGACTTACC ATGGCTGC TTC AAC TTC ATTTCC TC CCAG AC TAGGT ATGAG 
anriTTTrTTGTTTGTTTCTTTTTGCTCACTTCAAATGAAATGAAGAAACTT^TTTTTGAATCAG^ 
G TG ATC ATTGTGC TGTTTTGTT AATC TT AGC TATGTGTT AAA 




ORF Start: ATG at 23 


ORF Stop: TGA at 8282 



SEQ ID NO: 160 | 2753 aa | MW at 301743.8kD


NOV39a,
CG150799-01 
Protein Sequence 


MVWTFEVEGGPNPPDEDLSPVKGNITFPPGRATVIYl^TVLDDEVPENDEIFLIQLKSVEGGAEIN 
TSRNS I EI I IKKNDS PVRFLQS I YLVPEEDHIL 1 1 P WRGKDNNGNL IGSDEYEVS I S YAVTTGNST 
AHAQQNLDFIDLQPNTTVVFPPFIHESHLKFQIVDDTTPEIAESFHIMLLKDTLQGDAVLISPSVVQ 
VTIKPNDKPYGVLSFNSVLFERWIIDEDRISRYEEITVVRNGGTHGNVSANWVLTRNSTDPSPVTA 
DIRPSSGVLHFAQGQMLATIPLTVVDDDLPEEAEAYLLQILPHTIRGGAEVSEPAEDSDDWGLITF 
FPMENQKI ES S PGERYLSLSFTRLGGTKGDWLLYSVLYIPAGAVDPLQAKEGILNI SRRNDL IFPE 
QKTQVTTKLPIRNDAFFQNGAHFLVQLETVELLNI I PL I PPISPRFGEICNI SLLVTPAI ANGEIGF 
LSNLPIILHEPEDFAAEVVTIPLHRIX5TDGQATVYWSLKPSGFNSKAVTPDDIGPFNGSVLFLSGQS 
DTT I NITI KGDDI PEMNETVTL SLDRVNVENQVLK SG YT SRDL 1 1 L ENDD PGGVFEF S PASRG P YV I 
KEGESVELHI IRSRGSLVKQFLHYRVEPRDSNEFYGNTGVLEFKPGEREI VI TLLARLDGI PELDEH 
YWWLSSHGERESKLGSAT I VNI T I LKNDDPHG 1 1 EFVSDGL I VMI NESKGDAI YS AVYDWRNRGN 
FGDVSVSWWSPDFTQDWPVQGTVVFGDQEFSKNITIYSLPDEIPEEMEEFTVILLNGTGGAKVGN 
RTTATLRI RRNDD P I YFAE PR WRVQEGETANF TVLRNGSVDVTCMVQ YATKDGKATARERDF I PVE 
KGETLIFEVGSRQQSISIFVNEDGIPETDEPFYIILLNSTGDTVVYQYGVATVIIEANDDPNGIFSL 
EPIDKAVEEGKTNAFWILRHRGYFGSVSVSWQLFQITOSALQPGQEFYETSGTVNFMDGEEAKPIILH 
AFPDKIPEFNEFYFLKLVNISGPGGQLAETNLQVTVWPFNDDPFGVFILDPECLEREVAEDVLSED 
DMSYITNFTILRQQGVFGDVQLGWEILSSEFPAGLPPMIDFLLVGIFPTTVHLQQHMRRHHSGTDAL 
YFTGLEGAFGTVNPKYHPSRNNT I ANFTFSAWVMPNANTNGF 1 1 AKDDGNGS I YYGYKIQTNESHVT 
LSLHYKTLGSNATYIAKTTVMKYLEESVWLHLLIILEDGIIEFYLDGNAMPRGIKSLKGEAITDGPG 
I LR IGAG INGMDRFTGLMQDVRS YERKLTL EE I YELHAMP AKSDLH P I SGYL EFRQGETNKSF 1 1 S A 
RDDNDEEGEELFILKLVSVYGGARISEENTTARLTIQKSDNANGLFGFTGAC I PEIAEEGSTISCW 
ELRTRGAIJDYVHVFYTISQIETDGINYLVD^ PELNEYFRV 
TLVSAIPGDGKLGSTPTSGASIDPEKETTDITIKASDHPYGLLQFSTGLPPQPKDAMTLPASSVPHI 
TVEEEDGE IRLIiVIRAQGLLGRVTAEFRTVSLTAFSPED YQNVAGTLEFQPGERYKYI F INITDNS I 
PELEKS FKVELLNLEGGVAELFRVDGSGSASLGVASQ ILVT IAASDHAHGVFEFSPESLFVSGTEPE 
DG YS TVTLNVI RHHGTL S P VTLHWNI DSDPDGDLAFT SGN I TFE IGQTS ANI TVE I L PDEDPELDKA 
FSVSVLSVSSGSLGAHINATLTVLASDDPYGIFIFSEKNRPVKVEEATQNITLS I IRLKGLMGKVLV 
SYATLDDMEKPPYFPPNLARATQGRDYI PASGFALFGANQSEAT I AI S ILDDDEPERS ESVFIELLN 
STLVAKVQSRS I PNS PRLGPKVETI AQL 1 1 1 ANDDAFGTLQLSAPI VRVAENHVGPI INVTRTGGAF 
ADVSVKFKAVPITAIAGEDYSIASSDWLLEGETSKAVPIYVINDIYPELEESFLVQLMNETTGGAR 
LGALTEAVI IIEASDDPYGLFGFQITKLIVEEPEFNSVKVNLPI IRNSGTLGNVTVQWVATINGQLA 
TGDLRWSGNVTFAPGETIQTLLLEVLAD0VPEIEEVIQVQLTDASGGGTIGLDRIANIIIPAM5DP 
YGTOAFAQMVYRVQEPLERSSCANITVRRSGGHFGRLLLFYSTSDIDWALA^ 
QGVPDPLVmTWMNVSAVGEPLYTCATLCLKEQACSAFSFFSASEGPQCFWMTSWISPAVNNSDFWTY 
RKNMTRVASLFSGQAVAGSDYEP\rTRQWAIMQEGDEFA3^WSILPDDFPEM)ESFLISLLEVHLMN 
ISASLKNQPTIGQPNISTWIALNGDAFGVWIYNISPNTSEDGLFVEVQEQPQTLVELMIHRTGGS 
LGQVAVEWRWGGTATEGLDFIGAGEILTFAEGETKKTVILTILDDSEPEDDESIIVSLVYTEGGSR 
ILPS SDTVRVNILANDNVAGIVSFQTASRSVIGHEGEILQFHVIRTFPGRGNVTVNWKI IGQNLELN 
FANFSGQLFFPEGSLNTTLFVHLLDDNIPEEKEVYQVILYDVRTQGVPPAGIALLDAQGYAAVLTVE 
ASDEPHGVLNFALSSRFVLLOEANITIOLFIOTEFGSLGAINVTYTTVPGMLSLKNOWGNLAEPEV 
DFVPIIGFLILEEGETAAAINITILEDDVPELEEYF

CSLQMK





SEQ ID NO: 161 | 11925 bp


NOV39b, 
CG150799-02 
DNA Sequence 


CAGGGAAAAGGGAACCTATGGAATGGTCATGGTGACTTTTGAGGTAGAGGGTGGCCCAAATCCCCCT
GATGAAGATTTGAGTCCAGTTAAAGGAAATATCACCTTTCCCCCTGGCAGAGCAACAGTAATTTATA 
ACTTGACAGTACTCGATGACGAGGTACCAGAAAATGATGAAATATTTTTAATTCAACTGAAAAGTGT 
AGAAGGAGGAGCTGAGATTAACACCTCTAGGAATTCCATTGAGATCATCATTAAGAAAAATGATAGT 
CCCGTGAGATTCCTTCAGAGTATTTATTTGGTTCCTGAGGAAGACCACATACTCATAATTCCAGTAG 
TTCGTGGAAAGGACAACAATGGAAATCTGATTGGATCTGATGAATATGAGGTTTCAATCAGTTATGC 
TGTCACAACTGGGAATTCCACAGCACATGCCCAGCAAAATCTGGACTTCATTGATCTTCAGCCAAAC 
ACAACTGTTGTTTTTCCACCTTTTATTCATGAATCTCACTTGAAATTTCAAATAGTTGATGACACCA 
CACCGGAGATTGCTGAATCGTTTCACATTATGTTACTAAAAGATACCTTACAGGGAGATGCTGTGCT
AATAAGCCCTTCTGTTGTACAAGTCACCATTAAGCCAAATGATAAACCTTATGGAGTCCTTTCATTC
AACAGTGTTTTGTTTGAAAGGACAGTTATAATTGATGAAGATAGAATATCAAGATATGAAGAAATCA 
CAGTGGTTAGAAATGGAGGAACCCATGGGAATGTCTCTGCGAATTGGGTGTTGACACGGAACAGCAC
TGATCCCTCACCAGTAACAGCAGATATCAGACCGAGCTCTGGAGTTCTCCATTTTGCACAAGGGCAG 
ATGTTGGCAACAATTCCTCTTACTGTGGTTGATGATGATCTTCCAGAAGAGGCAGAAGCTTATCTAC 
TTCAAATTCTGCCTCATACAATACGAGGAGGTGCAGAAGTGAGCGAGCCAGCGGAGGATAGTGATGA 
TGTCTATGGCCTAATAACATTTTTTCCTATGGAAAACCAGAAGATTGAAAGCAGCCCAGGTGAACGA
TACTTATCCTTGAGTTTTACAAGACTAGGAGGGACTAAAGGAGATGTGAGGTTGCTTTATTCTGTAC
TTTACATTCCTGCTGGAGCTGTGGACCCCTTGCAAGCAAAAGAAGGCATCTTAAATATATCAAGGAG 
AAATGACCTCATTTTTCCAGAGCAAAAAACTCAAGTCACTACAAAATTACCAATAAGAAATGATGCA
TTCTTTCAAAATGGAGCTCACTTTCTAGTACAGTTGGAAACTGTGGAGTTGTTAAACATAATTCCTC 
TAATCCCACCCATAAGCCCTAGATTTGGGGAAATCTGCAATATTTCTTTACTGGTTACTCCAGCCAT
TGCAAATGGAGAAATTGGCTTTCTCAGCAATCTTCCAATTATTTTGCATGAACCAGAAGATTTTGCT 
GCTGAAGTGGTATACATTCCCTTACATCGGGATGGAACTGATGGCCAGGCTACTGTCTACTGGAGTT 
TGAAGCCCTCTGGCTTTAATTCAAAAGCAGTGACCCCGGATGATATAGGCCCCTTTAATGGCTCTGT 
TTTGTTTTTATCTGGGCAAAGTGACACAACAATCAACATTACTATCAAAGGTGATGACATACGGGAA 
ATGAATGAAACTGTAACACTTTCTCTAGACAGGGTTAACGTGGAAAACCAAGTGCTGAAATCTGGAT
ATACTAGCCGTGACCTAATTATTTTGGAAAATGATGACCCTGGGGGAGTTTTTGAATTTTCTCCTGC
TTCCAGAGGACCCTATGTTATAAAAGAAGGAGAATCTGTAGAGCTCCACATCATCCGATCAAGGGGG
TCCCTTGTTAAGCAGTTTCTACACTACCGAGTAGAGCCAAGAGATAGCAATGAATTCTATGGAAACA
CGGGAGTACTAGAATTTAAACCTGGAGAAAGGGAGATAGTGATCACCTTGCTAGCAAGATTGGATGG 
GATACCAGAGTTGGATGAACACTACTGGGTGGTCCTCAGCAGCCACGGAGAACGGGAAAGCAAGTTG 
GGAAGTGCCACCATTGTCAATATAACGATTCTGAAAAATGATGATCCTCATGGCATTATAGAATTTG 
TTTC TGATGGTC TAATTGTGATGAT AAATG AAAGC AAAGGAG ATGCTATCTATAGTGC TGTTTATGA 
TGTAGTAAGAAATCGAGGCAACTTTGGTGATGTTAGTGTATCATGGGTGGTTAGTCCAGACTTTACA 
CAAG ATGT ATT TCC TGTACAAGGGAC TGTTGTC TTTGGAGATCAGGAATTTTC AAAAAATAT CACC A 
TTTACTCCCTTCCAGATGAGATTCCAGAAGAAATGGAAGAATTTACCGTTATCCTACTGAATGGCAC 
TGGAGGAGCTAAAGTGGGAAATAGAACAACTGCAACTCTGAGGATTAGAAGAAATGATGACCCCATT 
TATTTTGCAGAACCTCGTGTAGTGAGGGTTCAGGAAGGTGAGACTGCCAACTTTACAGTTCTCAGAA 
ATGGATCTGTTGATGTGACTTGCATGGTCCAGTATGCTACCAAGGATGGGAAGGCTACTGCAAGAGA 
G AGAG ATTTCATTC C TGTTGAAAAAGGAGAAACGC TC ATTTTTGAGG TTGGAAGTAG AC AGC AGAGC 
ATATCCATATTTGTTAATGAAGATGGTATCCCGGAAACAGATGAGCCCTTTTATATAATCCTCTTGA 
ATTC AAC AGGTGATAC AGTAGTATATCAATATGGAGTAGC TACAGTAATAATTGAAGC TAATGATG A 
CCCAAATGGCATTTTTTCTCTGGAGCCCATAGACAAAGCAGTGGAAGAAGGAAAGACTAATGCATTT 
TGGATTTTGAGGCACCGAGGATACTTTGGTAGTGTTTCTGTATCTTGGCAGCTCTTTCAGAATGATT 
C TGC TT TGC AGCC TGGGC AGGAGTTCTATGAAACTTC AGGAACTGTTAACTTC ATGGATGG AGAAGA 
AGCAAAACCAATCATTCTCCATGCTTTTCCAGATAAAATTCCTGAATTCAATGAATTTTATTTCCTA 
AAAC TTGTAAACATTTC AGGTCC TGGGGGCC AGCTAGC AGAAACCAACCTCC AGGTGAC AGTAATGG 
r n r ppp ittp a zvTr:2XTr3a r nrr , P r r r n i no AnTTfTTATPTTCGATrr ag agtgtttagag agag aagtggc 
AGAAGATGTCCTGTCTGAAGATGATATGTCTTATATTACCAACTTCACCATTTTGAGGCAGCAGGGT 
GTGTTTGGTGATGTAC^CTGGGCTGGGAAATACTGTCCAGTGAGTTCCCTGCTGGTTTGCCACCAA 
TGATAGATTTTTTACTGGTTGGAATTTTCCCCACCACCGTGCATTTACAACAGCACATGCGGCGTCA 
CCACAGTGGAACGGATGCTTTGTACTTTACCGGACTAGAGGGTGCATTTGGGACTGTTAATCCAAAA 
TACCATCCCTCCAGGAATAATACAATTGCCAACTTTACATTCTCAGCTTGGGTAATGCCCAATGCCA 
ATACGAATGGATTCATTATAGCGAAGGATGACGGTAATGGAAGCATCTACTACGGGGTAAAAATACA 
AACAAACGAATCCCATGTGACACTTTCCCTTCATTATAAAACCTTGGGTTCCAATGCTACATACATT 
GCCAAGACAACAGTCATGAAATATTTAGAAGAAAGTGTTTGGCTTCATCTACTAATTATCCTGGAGG 
ATGGTATAATCGAATTCTACCTGGATGGAAATGCAATGCCCAGGGGAATCAAGAGTCTGAAAGGAGA 
AGC C ATT AC TGACGGTCCTGGGATACTGAGAATTGGAGCAGGGATAAATGGC AATGACAGATTTAC A 
GGTCTGATGCAGGATGTGAGGTCCTATGAGCGGAAACTGACGCTTGAAGAAATTTATGAACTTCATG 
C C ATGCCCGCAAAAAGTGAT T TAC ACCCAATTTC TGGATATC TGGAGTTCAG AC AGGGAGAAAC TAA 
CAAATCATTCATTATTTCTGCAAGAGATGACAATGACGAGGAAGGAGAAGAATTATTCATTCTTAAA 
CTAGTTTCTGTATATGGAGGAGCTCGTATTTCGGAAGAAAATACTACTGCAAGATTAACAATACAAA 
AAAGTGACAATGCAAATGGCTTGTTTGGTTTCACAGGAGCTTGTATACCAGAGATTGCAGAGGAGGG 
ATC AACCAT TTCTTGTGTGGTTGAGAGAACC AGAGGAGCTCTGGAT TATGTGCATGT TTTTTAC AC C 
ATTTCACAGATTGAAACTGATGGCATTAATTACCTTGTTGATGACTTTGCTAATGCCAGTGGAACTA 
TTACATTCCTTCCTTC

AC TTAATGAGTATTTC CGTGTGACATTGGTTTC TGC AATTC CTGGAGATGGGAAGC TAGGC TC AAC T 

C C TACC AGTGGTGC AAGCATAGATCCTGAAAAGG AAACGAC TGATATC AC C ATCAAAGCTAGTG ATC 

ATCC ATATGGC TTGC TGCAGTTCTCC ACAGGGC TGC CTC CTCAGCC TAAGG AC GC AATGAC CC TGC C 

TGCAAGCAGCGTTCCACATATCACTGTGGAGGAGGAAGATGGAGAAATCAGGTTATTGGTCATCCGT 

GCACAGGGACTTCTGGGAAGGGTGACTGCGGAATTTAGAACAGTGTCCTTGACAGCATTCAGTCCTG 

AGGATTAC C AGAATGTTGCTGG C ACATTAGAATTTC AAC C AGG AG AAAG ATATAAATACATTTTC AT 

AAAC ATCAC TGAT AATTC TATTC CTGAAC TGGAAAAATCTTTTAAAGTTGAG T TGTTAAAC TTGGAA 

GGAGG AGTAGCTG AAC TCTTTAGGGTTGATGGAAGTGGTAGTGCC AGTC TAGG AGTGGCTTCCC AAA 

TTCTAGTGACAATTGCAGCCTCTGACCACGCTCATGGCGTATTTGAATTTAGCCCTGAGTCACTCTT 

TGTCAGTGGAACTGAACCAGAAGATGGGTATAGCACTGTTACATTAAATGTTATAAGACATCATGGA 

ACTCTGTCTCCAGTGACTTTGCATTGGAACATAGACTCTGATCCTGATGGTGATCTCGCCTTCACCT 

CTGGCAACATCACATTTGAGATTGGGCAGACGAGCGCCAATATCACTGTGGAGATATTGCCTGACGA 

AGACCCAGAACTGGATAAGGCATTCTCTGTGTCAGTCCTCAGTGTTTCCAGTGGTTCTTTGGGAGCT 

CATATTAATGCCACGTTAACAGTTTTGGCTAGTGATGATCCATATGGGATATTCATTTTTTCTGAGA 

AAAAC AGAC CTGTTAAAGTTGAGGAAGC AACCC AGAAC ATCACAC T ATCAATAATAAGGTTGAAAGG 

CCTCATGGGAAAAGTCCTTGTCTCATATGCAACACTAGATGATATGGAAAAACCACCTTATTTTCCA 

CCTAATTTAGCGAGAGCAACTCAAGGAAGAGACTATATACCAGCTTCTGGATTTGCTCTTTTTGGAG 

CTAATCAGAGTGAGGCAACAATAGCTATTTCAATTTTGGATGATGATGAGCCAGAAAGGTCCGAATC 

TGTC TT TATCGAAC TAC TC AAC TCTACTTTAGT AGCGAAAGTAC AGAGTCGTTC AAT TCC AAATTCT 

CCACGTC TTGGGCC TAAGGTAGAAACTATTGCGC AACTAATTATCATTGC C AATGATGATGC ATTTG 

GAACTCTTCAGCTCTCAGCACCAATTGTCCGAGTGGCAGAAAATCATGTTGGACCCATTATCAATGT 

GACTAGAACAGGAGGAGCATTTGCAGATGTCTCTGTGAAGTTTAAAGCTGTGCCAATAACTGCAATA 

GC TGGTG AAG ATTATAGTATAGC TTC ATCAGATGTGGTC TTGC TAG AAGGGG AAACC AGTAAAG CCG 

TGCCAATATATGTCATTAATGATATCTATCCTGAACTGGAAGAATCTTTTCTTGTGCAACTGATGAA 

TGAAAGAACAGGAGGAGCCAGACTAGGGGCTTTAACAGAGGCAGTCATTATTATTGAGGCCTCTGAT 

GACCCCTATGGATTATTTGGTTTTCAGATTACTAAACTTATTGTAGAGGAACCTGAGTTTAACTCAG 

TGAAGGTAAACCTGCCAATAATTCGAAATTCTGGGACACTCGGCAATGTTACTGTTCAGTGGGTTGC 

CACCATTAATGGACAGCTTGCTACTGGCGACCTGCGAGTTGTCTCAGGTAATGTGACCTTTGCCCCT 

GGGGAAAC C ATTC AAACCTTGTTGTTAGAGGTC C TGGC TGACGACGTTCCGGAG ATTGAAGAGGTT A 

TCCAAGTGCAACTAACTGATGCCTCTGGTGGAGGTACTATTGGGTTAGATCGAATTGCAAATATTAT 

TATTCCTGCCAATGATGATCCTTATGGTACAGTAGCCTTTGCTCAGATGGTTTATCGTGTTCAAGAG 

CCTCTGGAAAGAAGTTCCTGTGCTAATATAACTGTCAGGCGAAGCGGAGGGCACTTTGGTCGGCTGT 

TGTTGTTC TAC AGTACTTCCGACATTGATGTAGTGGCTC TGGC AATGGAGGAAGGTC AAGATTT AC T 

GTCCTACTATGAATCTCC AATTC AAGGGGTGC C TGACCCACTTTGGAGAACTTGGATGAATGTC TC T 

GCCGTGGGGGAGCCCCTGTATACCTGTGCCACTTTGTGCCTTAAGGAACAAGCTTGCTCAGCGTTTT 

CATTTTTCAGTGCTTCTGAGGGTCCCCAGTGTTTCTGGATGACATCATGGATCAGCCCAGCTGTCAA 

CAATTCAGACTTCTGGACCTACAGGAAAAACATGACCAGGGTAGCATCTCTTTTTAGTGGTCAGGCT 

GTGGCTGGGAGTGACTATGAGCCTGTGACAAGGCAATGGGCCATAATGCAGGAAGGTGATGAATTCG 

C AAATC TC AC AGTGTC T ATTCTTCC TGATGATT TCC CAGAGATGGATGAGAGTTT TC TAATTTC TC T 

CC TTGAAGTTCACC TC ATGAACATTTC AGCCAGTTTGAAAAATC AGC CAACC ATAGGACAGCC AAAT 

ATTTCTAC AGTTGTC ATAGC ACTAAATGGTGATGC C TTTGGAGTGTTTGTGATC TACAGTATTAGTC 

CCAATACTTCCGAAGATGGCTTATTTGTTGAAGTTCAGGAGCAGCCCCAAACCTTGGTGGAGCTGAT 

GATACACAGGACAGGGGGCAGCTTAGGTCAAGTGGCAGTCGAATGGCGTGTTGTTGGTGGAACAGCT 

ACTGAAGGTTTAGATTTTATAGGTGCTGGAGAGATTCTGACCTTTGCTGAAGGTGAAACCAAAAAGA 

CAGTCATTTTAACCATCTTGGATGACTCTGAACCAGAGGATGACGAAAGTATCATAGTTAGTTTGGT 

GTACAC TGAAGGTGG AAGTAGAATTTTGCC AAGCTCCGACACTGTTAGAGTGAACATTTTGGC C AAT 

GACAATGTGGC AGG AATTGTTAGCTTTC AGAC AGC TTCCAGATC TGT CATAGGTCATGAAGGAGAAA 

T TTTACAATTCC ATGTGATAAGAACTTTC CC TGGTCGAGG AAATGTT AC TGTTAAC TGGAAAATTAT 

TGGGC AAAATC TAGAAC TC AATTTTGCTAAC TTTAGCGG AC AAC TTT TC TTTCC TGAGGGGTC GTTG 

AATACAACATTGTTTGTGCATTTGTTGGATGACAACATTCCTGAGGAGAAAGAAGTATACCAAGTCA 

TTCTGTATGATGTCAGGACACAAGGAGTTCCACCAGCCGGAATCGCCCTGCTTGATGCTCAAGGATA 

TGCAGCTGTCCTCACAGTAGAAGCCAGTGATGAACCACATGGAGTTTTAAATTTTGCTCTTTCATCA 

AGATTTGTGTTACTACAAGAGGCTAACATAACAATTCAGCTTTTCATCAACAGAGAATTTGGATCTC 

TAGGAG CTATC AATGTC ACATATACCACGGTTCC TGGAATGCTG AGTC TGAAGAACCAAACAGTAGG 

AAACCTAGCAGAGCCAGAAGTTGATTTTGTCCCTATCATTGGCTTTCTGATTTTAGAAGAAGGGGAA 

ACAGCAGCAGCCATCAACATTACCATTCTTGAGGATGATGTACCAGAGCTAGAAGAATATTTCCTGG 

TGAATTTAACTTACGTTGGACTTACCATGGCTGCTTCAACTTCATTTCCTCCCAGACTAGATTCAGA 

AGGTTTGACTGCACAAGTTATTATTGATGCCAATGATGGGGCCCGAGGTGTAATTGAATGGCAACAA 

AGCAGGTTTGAAGTAAATGAAACCCATGGAAGTTTAACATTGGTAGCCCAGAGGAGCAGAGAACCTC 

TTGGCCATGTTTCCTTATTTGTGTATGCTCAGAATTTGGAAGCACAAGTGGGGCTGGATTATATCTT 

CACCCCAATGATTCTTCATTTTGCTGATGGAGAAAGGTATAAAAATGTCAATATCATGATTCTTGAT 

GATGACATTCCAGAAGGAGATGAAAAATTTCAGCTGATTTTAACAAATCCTTCTCCTGGACTAGAGC 

TAGGGAAAAATAC AATAGCC TTAATTATTGTCC T TGCTAATGATGACGGCCC TGGAGT TC TATC ATT 

TAACAACAGTGAGCACTTTTTCCTAAGAGAGCCAACAGCTCTCTACGTCCAGGAGAGTGTTGCAGTA 

TTGTACATTGTTCGGGAACCTGCACAAGGATTGTTTGGAACAGTGACAGTTCAGTTCATTGTGACAG 

AAGTGAATTC C TC AAATG AATC TAAAGATCTG ACTCCTTCCAAAGGCTATATTGTT TTAGAAGAAGG 

TGTTCGATTCAAGGCCCTACAAATATCTGCCATATTAGACACGGAACCAGAAATGGATGAGTATTTT 

GTTTGCACCTTGTTTAATCCAACTGGAGGTGCTAGACTAGGGGTGCATGTTCAAACCCTGATAACAG 

TTTTGC AAAACCAGGCC CCTTTGGGGC TATTCAGTATC TCTGC AGTTGAAAATAGAGCC ACCTCC AT 

AGACATCGAAGAAGCCAATAGGACCGTGTATTTAAATGTATCTCGAACTAATGGCATTGATTTGGCT 

GTGAGTGTGCAGTGGGAGACAGTATCTGAAACAGCCTTTGGCATGAGGGGAATGGATGTTGTGTTTT 

CCGTATTTCAAAGTTTTTTGGATGAATCAGCTTCTGGCTGGTGTTTCTTTACTTTGGAAAATTTAAT 

ATATGGTATAATGTTAAGAAAATCATCTGTTACTGTTTACCGATGGCAGGGGATTTTTATTCCAGTT 

GAGGATTTAAATATAGAAAATCCTAAAAC TTGTGAGGC C TTTAATATTGGTTTTTCTCCC TACTTTG 

TGATTACTCATGAAGAAAGAAATGAAGAAAAGCCTTCTCTTAACAGTGTGTTTACATTCACATCTGG 
ATTTAAATTATTCCTGGTACAAACAATCATT^

GAC AGC CAAGATTATTTAATC ATTGC AAGTC AAAGAG ATGATTCCGAATTAAC TC AGGTC TTC AGGT 
GGAATGGAGGAAGCTTCGTGTTGCATCAAAAACTCCCTGTCCGAGGTGTGCTGACCGTGGCCTTGTT 
CAACAAGGGAGGCTCTGTGTTCTTAGCCATTTCCCAGGCTAATGCCAGGCTAAACTCCCTTTTATTC 
AGATGGTCTGGCAGTGGGTTTATTAACTTTCAAGAGGTGCCTGTCAGTGGGACAACAGAAGTTGAGG 
CT TTGTC TTC AGCC AATGATATTTAC CTAATATTTGCC AAAAATGTCTTTCTAGG AGATCAGAATTC 
AATTGATATTTTCATCTGGGAGATGGGACAGTCTTCCTTCAGGTATTTTCAGTCTGTAGATTTTGCT 
GC TGT TAACAGAATC C AC TCCTTC AC ACC AGC CTCAGGAATAGC CC AC ATACTTC TTATTGGC C AAG 
ATATGTCTGCTCTTTACTGCTGGAATTCGGAGCGTAATCAATTCTCTTTTGTTCTGGAAGTACCTTC 
TGCTTATGATGTGGCTTCTGTTACAGTAAAGTCCCTTAATTCAAGCAAGAATTTAATAGCTCTAGTG 
GGAGC TC ATTCAC ATATATATG AGC TAGC CTAC ATT TCC AGC C ATTC TGAC TTTAT TC CTAGTTCAG 
GTGAACTGATATTTGAACCTGGTGAGAGAGAAGCTACAATAGCAGTAAATATCCTTGATGATACAGT 
TCCAGAAAAAGAAGAATCCTTCAAAGTTCAACTTAAAAATCCCAAAGGAGGAGCAGAGATTGGCATT 
AATGATTCTGTAAC AATAAC CAT TC TGTC TAATGATGATGCC TATGGAATTGTTGCAT TTGCTC AGA 
ATTC ATT ATATAAGC AAGTGGAAG AAATGGAGC AAGATAGC C T AGT AAC CTTG AAC GTTGAACGC TT 
AAAAGGAACATATGGCCGTATAACCATAGCATGGGAAGCTGATGGAAGTATTAGTGATATATTTCCT 
AC CTCAGGAGTGATTTTATTTAC TGAAGGCCAGGT ACTGTC AAC AATC ACTCTAACTATTCTTGC TG 
ATAATATACCAGAGTTATC AGAGGTTGTG ATTGT AACCC TC ACC CGTATC ACC AC AGAAGGGGTTGA 
GGACTCATACAAAGGTGCTACTATTGATCAGGACAGAAGCAAGTCTGTTATAACAACTTTGCCCAAT 
GACTC ACC TTTTGGC TTGGTGGGCTGGCGTGC TGCGTCTGTC TTC ATTAGAGTAGC AGAGCCTAAAG 
AAAAC ACC ACCAC TC TTC AGTTAC AAATAGCTCG AGATAAAGGACTAC TTGGGGATATTGCCATTCA 
C TTGAGAGCTC AAC C CAATTTC T T ACTGC ATGTCG ATAATC AAGCT AC TGAGAATG AAGATTATGTA 
TTGC AAGAAAC AAT AATAATAATGAAAGAAAAC ATAAAAG AAGC TC ATGCCGAAGT TTCCATTTTGC 
CGGATGAC CTTCC TGAATTGGAGGAAGGATT T ATTGTC AC TATC ACTGAGGTGAAC C TGGTGAACT C 
TG ACTTC TC TAC AGGAC AGCC AAGTGTGCGGAGGC C CGGAATGGAAATAGC TGAGAT AATGATAGAA 
GAAAATGACGATCCCAGAGGAATTTTTATGTTTCATGTTACTAGAGGCGCTGGGGAAGTTATTACTG 
CCTATGAGGTGCCTCCACCCTTGAACGTTCTTCAAGTTCCTGTAGTCCGGCTGGCTGGAAGCTTTGG 
GGC AGTAAATGTTT ATTGGAAAGCATCACC AG AC AGTGC TGGCC TGGAAGACTTTAAACCATC TC AT 
GGGATTCTTGAATT TGC AGAT AAAC AGGTTAC TGC AATGATAGAAATC ACCATAATTGATGATGCTG 
AATTTGAATTGACAGAGACGTTCAATATTTCCTTGATCAGTGTTGCTGGAGGTGGCAGACTTGGTGA 
TGATGTTGTGGTAACTGTTGTTATTCCACAAAATGATTCTCCATTTGGAGTATTTGGATTTGAAGAA 
AAGACTGTAAGTTAAACATATCAGGGGAAAGCCTTGTTTCAGGCTAGCGTTTCATGTAATTTTGAGT 


AGAAAGTGTCTCACATTTTTGTTTTGGAAGTCTTGGCCAGGCATGGTGGCTCATGCCAGTAATCCCA 


GCACTTTGGGAGGCCGCAGCGGGCAGATCACGAGGTCAGGAGATTGACACCATCCTGGCCAATATGG 


T TG AATTCCCGTCTCTAC TGAAAGTAC AAAAATTAGC TGGGC GTGGTGGCAC ATGCCTGTAT TCCC A 


G AT ACTTGGGAGGC TGAGGC AGGAGACTCGC TTGAACCC AGGAGGC AGAGGTTGC AGTGAGC TG AG A 


TCACGCCATTGCACTCCAGCCTGGCGACATAGAGAGACTCCATCTCAAAAAAAAAAAAAAAAAAAG 




ORF Start: ATG at 23 | ORF Stop: TAA at 11537





SEQ ID NO: 162 | 3838 aa | MW at 421384.3kD


NOV39b, 
CG150799-02
Protein Sequence 


MVMVTFEVEGGPNPPDEDLS PVKGNITFPPGRATVI YNLTVLDDEVPENDE IFLIQLKS VEGGAE IN 
TSRNSIEI I IKKNDSPVRFLQS I YLVPEEDHI LI I P WRGKDNNGNLIGSDEYEVS I S YAVTTGNST 
AHAQQNLDFIDLQPNTTVVFPPFIHESHLKFQIVDDTTPEIAESFHIMLLKDTLQGDAVLISPSWQ 
VTIKPNDKPYGVLSFNSVLFERTVI IDEDR I SRYEE I TWRNGGTHGNVSANWVLTRNSTDPSPVTA 
DIRPSSGVLHFAQGQMLATIPLTVVDDDLPEEAEAYLLQILPHTIRGGAEVSEPAEDSDDVYGLITF 
F PMENQKIE S S PGER YLSLS FTRLGGTKGDVRLL YS VL Y I PAGAVDPLQAKEG ILNI SRRNDL I F PE 
QKTQVTTKLPIRNDAFFQNGAHFLVQLETVELLNII PLI PPI SPRFGEICNISLLVTPAIANGEIGF 
LSl^PIILHEPEDFAAEVVYIPLHRDGTIXSQATVYWSLKPSGFNSKAVTPDDIGPFNGSVLFLSGQS 
DTTINITIKGDDI PEMNETVTLSLDRVNVENQVLKSGYTSRDLI ILENDDPGGVFEFSPASRGPYVI 
KEGESVELH I IRSRGSLVKQFLHYRVEPRDSNEF YGNTGVLEFKPGERE I VITLLARLDGI PELDEH 
YWVVLSSHGERESKLGSATIVNITILKNDDPHGIIEFVSIX3LIVM 

FGDVSVSWWS PDFTQDWPVQGTVVFGDQEFSKNIT I YSLPDE IPEEMEEFTVILIiNGTGGAKVGN 
RTT ATLR I RRNDDP I YF AEPRWRVQEGETANFTVLRNG SVDVTCMVQ YATKDGKATARERDF I P VE 
KGETLIFEVGSRQQS I S IFVNEDGI PETDEPF YI I LLNSTGDTWYQYGVATVI I EANDDPNG IFSL 
EPIDKAVEEGKTNAFWILRHRGYFGSVSVSWQLFQNDSALQPGQEFYETSGTVNFMDGEEAKPIILH 
AFPDKI PEFNEFYFLKLVNI SGPGGQLAETl^QVTVMVPFlSrDDPFGVFILDPECLEREVAEDVLSED 
DMSYIT3WTILRQQGVFGDVQLGWEILSSEFPAGLPPMIDFLLVGIFPTTVHLQQHMRRHHSGTDAL 
YFTGLEGAFGTVNPKYHPSRNNT I ANFTFSAWVMPNANTNGFI I AKDDGNGS I YYGVKIQTNESHVT 
LSLHYKTLGSNATYIAKTTVMKYLEESVWLHLLI ILEDGI IEFYLDGNAMPRGIKSLKGEAITDGPG 
ILRIGAGINGNDRFTGLMQDVRSYERKLTLEEIYEIiHAMPAKSDLHPISGYLEFRQGETNKSFIISA 
RDDNDEEGEELF ILKLVSVYGGARI SEENTTARLTIQKSDNANGLFGFTGAC I PEIAEEGSTISC W 
ERTRGALDYVHVFYTISQIETIX5INYLVDDFANASGTITFLPW 

TLVSAIPGDGKLGSTPTSGASIDPEKETTDITIKASDHPYGLLQFSTGIiPPQPKDAMTLPASSVPHI 

TVEEEDGEIRLLVIRAQGLLGRVTAEFRTVSLTAFSPEDYQNVAGTLEFQPGERYKYIFINITDNSI 

PELEKSFKVELLNLEGGVAELFRVDGSGSASLGVASQILVTIAASDHAHGVFEFSPESLFVSGTEPE 

IX3YSWTLOTIRHHGTLSPVTLHWWIDSDPDGDLAFTSGNITFEIGQTSANITVEI 

FSVSVLSVSSGSLGAHINATLTVLASDDPYGIFIFSEKNRPVKVEEATQNI TLS 1 1 RLKGLMGKVL V 

SYATLDDMEKPPYFPPNLARATOGRDYIPASGFALFGANOSEATIAISILDDDEPERSESVFIELLN 
ADVSVKFKAVPITAIAGEDYSIASSDVVLLEGETSKAVPIWINDIYPELEESFLVQLMNETTGGAR
LGALTEAVT I IEASDDPYGLFGFQITKLIVEEPEFNSVKVNLPI IRNSGTLGNVTVQWVATINGQLA 
TGDLRWSGNVTFAPGETIQTLIiLEVLADDVPEIEEVIQVQLTDASGGGTIGLDRIANI I IPANDDP 
YGTVAFAQMVYRVQEPLERSSCANI TVRRSGGHFGRLLLF YSTSDIDWALAMEEGQDLLSYYES PI 
QGVPDPLWRTWMWSAVGEPLYTCATLCLKEQACSAFSFFSASEGPQCFWMTSWISPAVl^SDFWTY 
RKNMTRVASLFSGQAVAGSDYEPVTRQWAIMQEGDEFANLTVSILPDDFPEMDESFLISLLEVHLMN 
ISASLKNQPTIGQPNISTVVIALNGDAFGVFVIYSISPNTSEDGLFVEVQEQPQTLVELMIHRTGGS 
LGQVAVEWRWGGTATEGLDFIGAGEILTFAEGETKKTVILTILDDSEPEDDESIIVSLVYTEGGSR 
ILPSSDTVRTOILANDNVAGIVSFQTASRSVIGHEGEIL^^ 

FANFSGQLFFPEGSLNTTLFVHLLDDNIPEEKEVYQVILYDVRTQGVPPAGIALLDAQGYAAVLTVE 

ASDEPHGVLl^ALSSRFVLLQEANITIQLFIl^EFGSLGAINVTYTTOPGMLSLKNQTVGl^AEP^ 

DFVPIIGFLILEEGETAAAINITILEDDVPELEEYFLVNLTYVGLTMAASTSFPPRLDSEGLTAQVI 

IDANIXSARGVIEWQQSRFEVl^THGSLTLVAQRSREPLGHVSLFVYAQNLEAQVGLDYIFTPMILHF 

ADGERYKNVNIMILDDDI PEGDEKFQLILTNPSPGLELGKNTIALI IVLANDDGPGVLSFNNSEHFF 

LREPTALYVQESVAVLYIVREPAQGLFGTVTVQFr^EVNSSNESKDLTPSKGYIVLEEGVR^ 

I SAILDTE PEMDEYFVCTLFNP^TGGARLGVHVQTLI TVLQNQ APLGLF S I SAVENRAT S ID I EEANR 

TVYLNVSRTNG IDLAVSVQWE WSETAFGMRGOTVWSVFQSFI^ES ASGWCFFTLENL I YGIJMLRK 

SSVTVYRWQGIFIPVEDLNIENPKTCEAFNIGFSPYFVITHEERNEEKPSLNSVFTFTSGFKLFLVQ 

TI IILESSQWYFTSDSQDYLI IASQRDDSELTQVFRWNGGSFVLHQKLPVRGVL WALFNKGGSVF 

LAISQANARLNSLLFRWSGSGFIOTQEVPVSGTTEVEALSSANDIYLIFAKNW 

MGQSSFRYFQSVDFAAVNRIHSFTPASGIAHILLIGQDMSALYCWNSERNQFSFVLEVPSAYDVASV 

TVKSLNSSKNLIALVGAHSHIYELAYISSHSDFIPSSGELIFEPGEREATIAVNILDDTVPEKEESF 

KVQLKNPKGGAEIGINDSVTITILSNDDAYGIVAFAQNSLYKQVEEMEQDSLVTLNVERLKGTYGRI 

TIAWEADGSISDIFPTSGVILFTEGQVLSTITLTILADNIPELSEWIVTLTRITTEGVEDSYKGAT 

IDQDRSKSVITTLPNDSPFGLVGWRAASVFIRVAEPKENTTTLQLQIARDKGLLGDIAIHLRAQPNF 

LLHVDNQATENEDYVLQETI I IMKENIKEAHAEVS ILPDDLPELEEGF IVTITEVNIiVNSDFSTGQP 

SVRRPGME I AEIMI EENDDPRGIFMFHVTRGAGEVITAYEVPPPLNVLQVPVVRLAGSFGAVNVYWK 

AS PD S AGLEDFKPSHG ILEFADKQVTAMI E I TI IDDAEFELTETFNI S L I SVAGGGRLGDDWVTW 

I PQNDS PFGVFGFEEKTVS 
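
The reported ORF positions and protein lengths in these tables can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the original disclosure. It assumes that each "ORF Start" value is the 1-based position of the first base of the ATG codon and that each "ORF Stop" value is the 1-based position of the first base of the stop codon; the function name and the dictionary are hypothetical, and the NOV39b stop position is read as 11537.

```python
def orf_codon_count(start: int, stop: int) -> int:
    """Return the number of sense codons between a start and a stop codon.

    Assumption: `start` is the 1-based position of the A in ATG and
    `stop` is the 1-based position of the first base of the stop codon.
    """
    span = stop - start
    if span <= 0 or span % 3 != 0:
        raise ValueError("stop codon is not in frame with the start codon")
    return span // 3


# (ORF start, ORF stop, reported protein length in aa) as listed per clone
REPORTED = {
    "NOV39a": (23, 8282, 2753),
    "NOV39b": (23, 11537, 3838),
    "NOV39c": (23, 4430, 1469),
}

for name, (start, stop, reported_aa) in REPORTED.items():
    computed = orf_codon_count(start, stop)
    status = "consistent" if computed == reported_aa else "MISMATCH"
    print(f"{name}: computed {computed} aa, reported {reported_aa} aa ({status})")
```

Under these assumptions the computed codon counts agree with the protein lengths given in the SEQ ID NO: 160, 162 and 164 header rows.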





SEQ ID NO: 163 | 5102 bp




NOV39c, 
CG150799-03 
DNA Sequence 


CAGGGAAAAGGGAACCTATGGAATGGTCATGGTGACTTTTGAGGTAGAGGGTGGCCCAAATCCCCCT 
GATGAAGATTTGAGTCCAGTTAAAGGAAATATCACCTTTCCCCCTGGCAGAGCAACAGTAATTTATA 
AC TTGACAGTACTCGATGACGAGGTACC AGAAAATGATGAAATAT TTTT AATTC AAC TGAAAAGTGT 
AGAAGGAGGAGCTGAGATTAACACCTCTAGGAATTCCATTGAGATCATCATTAAGAAAAATGATAGT 
CCCGTGAGATTCCTTCAGAGTATTTAT T TGGTTCCTGAGGAAG ACCAC ATAC TCATAATTCC AGTAG 
TTCGTGG AAAGGAC AACAATGGAAATCTGAT TGGATCTGATG AAT ATGAGGT TTCAATC AGT TATGC 
TGTC AC AACTGGGAATTC C AC AGCACATGCCC AGCAAAATCTGGAC TTC ATTGATC TTCAGCCAAAC 
AC AAC TGTTGT TTTTCCACC TTTTATTC ATGAATCTCACTTG AAATTTC AAATAGTTGATGAC ACC A 
CACCGGAGATTGCTGAATCGTTTCACATTATGTTACTAAAAGATACCTTACAGGGAGATGCTGTGCT 
AATAAGCC C TTCTGTTGTACAAGTC ACC ATTAAGC C AAATG ATAAACC TTATGGAGTC CTTTC ATTC 
AACAGTGTTTTGTTTGAAAGGACAGTTATAATTGATGAAGATAGAATATCAAGATATGAAGAAATCA 
CAGTGGTTAGAAATGG AGGAACCCATGGGAATGTC TC TGCG AATTGGGTGTTGACACGGAAC AGC AC 
TGATCCCTCACCAGTAACAGCAGATATCAGACCGAGCTCTGGAGTTCTCCATTTTGCACAAGGGCAG 
ATGTTGGCAACAATTCCTCTTACTGTGGTTGATGATGATCTTCCAGAAGAGGCAGAAGCTTATCTAC 
TTCAAATTCTGCCTCATACAATACGAGGAGGTGCAGAAGTGAGCGAGCCAGCGGAGGATAGTGATGA 
TGTCTATGGCCTAATAACATTTTTTCCTATGGAAAACCAGAAGATTGAAAGCAGCCCAGGTGAACGA 
TACTTATCCTTGAGTTTTAC AAGAC TAGGAGGGAC TAAAGGAGATGTGAGGTTGCTTTATTC TGTAC 
T TTAC ATTCCTGCTGGAGCTGTGGACCC C T TGC AAGC AAAAGAAGGC ATC TTAAATATATC AAGGAG 
AAATGACCTCATTTTTCCAGAGCAAAAAACTCAAGTCACTACAAAATTACCAATAAGAAATGATGCA 
TTCTTTCAAAATGGAGCTCACTTTCTAGTACAGTTGGAAACTGTGGAGTTGTTAAACATAATTCCTC 
TAATCCCACCCATAAGCCCTAGATTTGGGGAAATCTGCAATATTTCTTTACTGGTTACTCCAGCCAT 
TGCAAATGGAGAAATTGGCTTTCTCAGCAATCTTCCAATTATTTTGCATGAACCAGAAGATTTTGCT 
GCTGAAGTGGTATACATTCCCTTACATCGGGATGGAACTGATGGCCAGGCTACTGTCTACTGGAGTT 
TGAAGCCC TCTGGC TTTA^LTTC AAAAGC AGTG ACCCCGGATG ATATAGGC CC CTTTAATGGCTCTGT 
TTTGTTTTTATCTGGGCAAAGTGACACAACAATCAACATTACTATCAAAGGTGATGACATACCGGAA 
ATGAATGAAACTGTAACACTTTCTCTAGACAGGGTTAACGTGGAAAACCAAGTGCTGAAATCTGGAT 
ATACTAGCCGTGACCTAATTATTTTGGAAAATGATGACCCTGGGGGAGTTTTTGAATTTTCTCCTGC 
TTCCAGAGGACCCTATGTTATAAAAGAAGGAGAATCTGTAGAGCTCCACATCATCCGATCAAGGGGG 
TCCCTTGTTAAGCAGTTTCTACACTACCGAGTAGAGCCAAGAGATAGCAATGAATTCTATGGAAACA 
CGGGAGTACTAGAATTTAAACCTGGAGAAAGGGAGATAGTGATCACCTTGCTAGCAAGATTGGATGG 
GATACCAGAGTTGGATGAACACTACTGGGTGGTCCTCAGCAGCCACGGAGAACGGGAAAGCAAGTTG 
GGAAGTGCCACCATTGTCAATATAACGATTCTGAAAAATGATGATCCTCATGGCATTATAGAATTTG 
TTTCTGATGGTCTAATTGTGATGATAAATGAAAGCAAAGGAGATGCTATCTATAGTGCTGTTTATGA 
TGTAGTAAGAAATCGAGGCAACTTTGGTGATGTTAGTGTATCATGGGTGGTTAGTCCAGACTTTACA 
CAAGATGTATTTCCTGTACAAGGGACTGTTGTCTTTGGAGATCAGGAATTTTCAAAAAATATCACCA 
TTTACTCCCTTCCAGATGAGATTCCAGAAGAAATGGAAGAATTTACCGTTATCCTACTGAATGGCAC 
TGGAGGAGCTAAAGTGGGAAATAGAACAACTGCAACTCTGAGGATTAGAAGAAATGATGACCCCATT 
TATTTTGCAGAACCTCGTGTAGTGAGGGTTCA^




ATGGATCTGTTGATGTGACTTGCATGGTCCAGTATGCTACCAAGGATGGGAAGGCTACTGCAAGAGA 
GAGAGATTTCATTCCTGTTGAAAAAGGAGAAACGCTCATTTTTGAGGTTGGAAGTAGACAGCAGAGC 
ATATCCATATTTGTTAATGAAGATGGTATCCCGGAAACAGATGAGCCCTTTTATATAATCCTCTTGA 
ATTCAACAGGTGATACAGTAGTATATCAATATGGAGTAGCTACAGTAATAATTGAAGCTAATGATGA 
CCCAAATGGCATTTTTTCTCTGGAGCCCATAGACAAAGCAGTGGAAGAAGGAAAGACTAATGCATTT 
TGGATTTTGAGGCACCGAGGATACTTTGGTAGTGTTTCTGTATCTTGGCAGCTCTTTCAGAATGATT 
C TGC TTTGCAGC CTGGG C AGGAGTTCT ATGAAACTTCAGGAACTGTTAAC TTC ATGG ATGGAGAAG A 
AGCAAAACCAATCATTCTCCATGCTTTTCCAGATAAAATTCCTGAATTCAATGAATTTTATTTCCTA 
AAACTTGTAAACATTTCAGGTCCTGGGGGCCAGCTAGCAGAAACCAACCTCCAGGTGACAGTAATGG 
TTCCATTCAATGATGATCCCTTTGGAGTTTTTATCTTGGATCCAGAGTGTTTAGAGAGAGAAGTGGC 
AGAAGATGTCCTGTCTGAAGATGATATGTCTTATATTACCAACTTCACCATTTTGAGGCAGCAGGGT 
GTGTTTGGTGATGTACAACTGGGCTGGGAAATACTGTCCAGTGAGTTCCCTGCTGGTTTGCCACCAA 
TGATAGATTTTTTACTGGTTGGAATTTTCCCCACCACCGTGCATTTACAACAGCACATGCGGCGTCA 
CCACAGTGGAACGGATGCTTTGTACTTTACCGGACTAGAGGGTGCATTTGGGACTGTTAATCCAAAA 
TACCATCCCTCCAGGAATAATACAATTGCCAACTTTACATTCTCAGCTTGGGTAATGCCCAATGCCA 
ATACGAATGGATTCATTATAGCGAAGGATGACGGTAATGGAAGCATCTACTACGGGGTAAAAATACA 
AACAAACGAATCCCATGTGACACTTTCCCTTCATTATAAAACCTTGGGTTCCAATGCTACATACATT 
GC CAAGAC AAC AGTCATGAAATATTTAGAAG AAAGTGTTTGGC TTC ATCTACTAATTATCC TGG AGG 
ATGGTATAATCGAATTC TACC TGGATGGAAATGCAATGCCC AGGGG AATCAAGAGTCTGAAAGGAGA 
AGCCATTACTGACGGTCCTGGGATACTGAGAATTGGAGCAGGGATAAATGGCAATGACAGATTTACA 
GGTCTGATGCAGGATGTGAGGTCCTATGAGCGGAAACTGACGCTTGAAGAAATTTATGAACTTCATG 
CCATGCCCGCAAAAAGTGATTTACACCCAATTTCTGGATATCTGGAGTTCAGACAGGGAGAAACTAA 
C AAATCATTC ATT ATTTC TGC AAGAGATGAC AATGACGAGG AAGG AG AAGAATTATTC ATTC TTAAA 
CTAGTTTCTGTATATGGAGGAGCTCGTATTTCGGAAGAAAATACTACTGCAAGATTAACAATACAAA 
AAAGTGACAATGCAAATGGCTTGTTTGGTTTCACAGGAGCTTGTATACCAGAGATTGCAGAGGAGGG 
ATCAACCATTTCTTGTGTGGTTGAGAGAACCAGAGGAGCTCTGGATTATGTGCATGTTTTTTACACC 
ATTTCACAGATTGAAACTGATGGCATTAATTACCTTGTTGATGACTTTGCTAATGCCAGTGGAACTA 
TTACATTCCTTCCTTGGCAGAGATCAGAGCTTTTGATTGAAGTGTCGCTTCCCATTATTATTTACAA 
CTGTAACTGATACATTAG7VATTTGCTTCAAACATGTCTGCTGTAAAACCTTTATCAGGTTCTGAATA 


TATATGTTCTTGATGATGATATTCCTGAACTTAATGAGTATTTCCGTGTGACATTGGTTTCTGCAAT 


TCCTGGAGATGGGAAGCTAGGCTCAACTCCTACCAGTGGTGCAAGCATAGATCCTGAAAAGGAAACG 


ACTGATATCACCATCAAAGCTAGTGATCATCCATATGGCTTGCTGCAGTTCTCCACAGGGCTGCCTC 


CTCAGCCTAAGGACGCAATGACCCTGCCTGCAAGCAGCGTTCCACATATCACTGTGGAGGAGGAAGA 


TGGAGAAATCAGGTTATTGGTCATCCGTGCACAGGGACTTCTGGGAAGGGTGACTGCGGAATTTAGA 


ACAGTGTCCTTGACAGCATTCAGTCCTGAGGATTACCAGAATGTTGCTGGCACATTAGAATTTCAAC 


C^GGAGAAAGATATAAATACATTTTCATAAAC^TCACTGATAATTCTATTCCTGAACTGGAAAAATC 


TTT TAAAGTTGAGTTGTTAAACTTGGAAGGAGGAGCTCTGC TAGATCTATCTACAGATAT AACGC TG 


TAAAATC TGGTCCTTTTGGATGATCTATAATGAGTTGATTATTAAT AAAAGAAGTCAAC AATACC TT 


AAAAAAAAAA 




ORF Start: ATG at 23 | ORF Stop: TGA at 4430





SEQ ID NO: 164 | 1469 aa | MW at 162809.6kD


NOV39c, 
CG150799-03 
Protein Sequence 


MVMVTFEVEGGPNPPDEDLS PVKGNITFPPGRATVI YNLTVLDDEVPENDE IFLI QLKSVEGGAEIN 
TSRNSIEI IIKKNDSPVRFLQSIYLVPEEDHILII PVVRGKDl^GNLI^ 

AHAQQNLDFIDLQPOTTVWPPFIHESHLKFQIVDDTTPEIAESFHIMLLKiyrLQGDAVLISPSVVQ 
VTIKPNDKPYGVLSFNSVLFERTVIIDEDRISRYEEITWRNGGTHGOT 

DIRPSSGVLHFAQGQMLATIPLTVVDDDLPEEAEAYLLQILPHTIRGGAEVSEPAEDSDDVYGLITF 
FPMENQKI ES SPGERYLSLSFTRLGGTKGDVRLLYSVLYIPAGAVDPLQAKEGILNI SRRNDLI FPE 
QKTQWTKLP IRNDAFFQNGAHFLVQLETVELLNI IPLI PP ISPRFGEICNI SLLVT PAI ANGEIGF 
LSNLPI ILHEPEDFAAEWYI PLHRDGTDGQATVYWSLKPSGFNSKAVTPDDIGPFNGS VLFLSGQS 
DTTINITIKGDDIPEMNETVTLSLDRVNVENQVLKSGYTSRDLIILENDDPGGVFEFSPASRGPYVI 
KEGESVELHI IRSRGSLVKQFLHYRVEPRDSNEFYGNTGYLEFKPGEREIVITLLARLDGIPELDEH 
YWWLS SHGERE SKLG S AT I VNI T I LKNDDPHGI I EFVSDGL I VMINESKGDAI YS AVYDWRNRGN 
FGDVSVSWWS PDFTQDVFPVQGTWFGDQEFSKNIT I YSLPDE I PEEMEEFTVILLNGTGGAKVGN 
RTTATLRIRRNDD P I YF AE PRVVRVQEGETANFTVLRNGSVDVTCMVQYATKIX3KATARERDFI PVE 
KGETLIFEVGSRQQS I S I FVNEDGIPETDEPFYI ILLNSTGDTWYQYGVATVI I EANDDPNGI FSL 
EPIDKAVEEGKTNAFWILRHRGYFGSVSVSWQLFQNDSALQPGQEFYETSGTVNFMIX3EEAKPIILH 
AFPDKIPEFNEFYFLKLWISGPGGQLAETIfoQVTVMVPFlTODPFGOT 

DMSYITNFTILRQQGVFGDVQLGWEILSSEFPAGLPPMIDFLLVGIFPTTVHLQQHMRRHHSGTDAL 
YFTGLEGAFGTVNPKYHPSRNNTIANFTFSAWVMPNANT^^ 

LSLHYKTLGSNATYIAKTTVMKYLEESVWLHLLIILEDGIIEFYLDGNAMPRGIKSLKGEAITDGPG 
ILRIGAGINGNDRFTGLMQDTOSYERKLTLEEI YELHAMPAKSDLHPI SGYLEFRQGETNKS FI ISA 
RDDNDEEGEELFILKLVSVYGGARISEENTTARLTIQKSDNANGLFGFTGACIPEIAEEGSTISCVV 
ERTRGALDYVHVF YT I S Q I ETDGI NYLVDDFANASGT I TFL P WQRS ELL I EVS LP 1 1 1 YNCN 
SEQ ID NO: 165 | 8350 bp


NOV39d, 
CG150799-01 
DNA Sequence 


C AGGGAAAAGGGAACC TATGGAATGGTC ATGGTGAC TTTTGAGGTAGAGGGTGGCCC AAATCCCCCT 
GATGAAGATTTGAGTCCAGTTAAAGGAAATATCACCTTTCCCCCTGGCAGAGCAACAGTAATTTATA 
ACTTGACAGTACTCGATGACGAGGTACCAGAAAATGATGAAATATTTTTAATTCAACTGAAAAGTGT 
AGAAGGAGGAGCTGAGATTAACACCTCTAGGAATTCCATTGAGATCATCATTAAGAAAAATGATAGT 
CC CGTGAGATTCC TTC AG AGTATTTATTTGGTTC C TGAGGAAGACC AC ATACTC AT AATTCCAGTAG 
TTCGTGGAAAGGACAACAATGGAAATCTGATTGGATCTGATGAATATGAGGTTTCAATCAGTTATGC 
TGTCACAACTGGGAATTCCACAGCACATGCCCAGCAAAATCTCGACTTCATTGATCTTCAGCCAAAC 
AC AACTGTTGTTTTTC C ACC TT TT ATTC ATGAATCTC ACTTGAAATTTC AAATAGT TGATGAC ACC A 
CACCGGAGATTGCTGAATCGTTTCACATTATGTTACTAAAAGATACCTTACAGGGAGATGCTGTGCT 
AATAAGC CC TTCTGTTGTACAAGTC AC C ATTAAGCC AAATG ATAAACCTTATGGAGTCC TTTCATTC 
AACAGTGTTTTGTTTGAAAGGACAGTTATAATTGATGAAGATAGAATATCAAGATATGAAGAAATCA 
CAGTGGTTAGAAATGGAGGAACCCATGGGAATGTCTCTGCGAATTGGGTGTTGACACGGAACAGCAC 
TGATCCCTCACCAGTAACAGCAGATATCAGACCGAGCTCTGGAGTTCTCCATTTTGCACAAGGGCAG 
ATGTTGGCAACAATTCCTCTTACTGTGGTTGATGATGATCTTCCAGAAGAGGCAGAAGCTTATCTAC 
TTCAAATTCTGCCTCATACAATACGAGGAGGTGCAGAAGTGAGCGAGCCAGCGGAGGATAGTGATGA 
TGTCTATGGCCTAATAACATTTTTTCCTATGGAAAACCAGAAGATTGAAAGCAGCCCAGGTGAACGA 
TACTTATCCTTGAGTTTTACAAGACTAGGAGGGACTAAAGGAGATGTGAGGTTGCTTTATTCTGTAC 
TTTAC ATTCC TGC TGGAGCTGTGG ACC CCTTGCAAGC AAAAGAAGGC ATCTTAAATATATCAAGGAG 
AAATGACCTCATTTTTCCAGAGCAAAAAACTCAAGTCACTACAAAATTACCAATAAGAAATGATGCA 
TTC TTTC AAAATG GAG CTC AC TTTC T AGTAC AGTTGGAAACTGTGGAGTTGTTAAAC ATAATTCCTC 
TAATCCCACC CATAAGCCCTAGATTTGGGGAAATCTGCAATATTTC TTTACTGGTTAC TCCAGCC AT 
TGCAAATGGAGAAATTGGCTTTCTCAGCAATCTTCCAATTATTTTGCATGAACCAGAAGATTTTGCT 
GCTGAAGTGGTATACATTCCCTTACATCGGGATGGAACTGATGGCCAGGCTACTGTCTACTGGAGTT 
TGAAGCCCTCTGGCTTTAATTCAAAAGCAGTGACCCCGGATGATATAGGCCCCTTTAATGGCTCTGT 
TTTGTTTTTATCTGGGCAAAGTGACACAACAATCAACATTACTATCAAAGGTGATGACATACCGGAA 
ATGAATGAAACTGTAACACTTTCTCTAGACAGGGTTAACGTGGAAAACCAAGTGCTGAAATCTGGAT 
ATACTAGCCGTGACCTAATT ATT TTGG AAAATG ATGACCC TGGGGGAGTTTTTGAATT TTCTCC TGC 
TTCCAGAGGACCCTATGTTATAAAAGAAGGAGAATCTGTAGAGCTCCACATCATCCGATCAAGGGGG 
TC CCTTGTTAAGC AGTTTC T AC AC T AC CGAGTAGAGC CAAGAGATAGC AATGAATTCT ATGGAAAC A 
CGGGAGTACTAGAATTTAAACCTGGAGAAAGGGAGATAGTGATCACCTTGCTAGCAAGATTGGATGG 
GATACCAGAGTTGGATGAAC ACTACTGGGTGGT CC TC AGC AGCC ACGGAGAACGGGAAAGCAAGTTG 
GGAAGTGCCACCATTGTCAATATAACGATTCTGAAAAATGATGATCCTCATGGCATTATAGAATTTG 
TTTCTGATGGTCTAATTGTGATGATAAATGAAAGCAAAGGAGATGCTATCTATAGTGCTGTTTATGA 
TGTAGTAAGAAATCGAGGCAACTTTGGTGATGTTAGTGTATCATGGGTGGTTAGTCCAGACTTTACA 
CAAGATGTATTTCCTGTACAAGGGACTGTTGTCTTTGGAGATCAGGAATTTTCAAAAAATATCACCA 
TTTACTCCCTTCCAGATGAGATTCCAGAAGAAATGGAAGAATTTACCGTTATCCTACTGAATGGCAC 
TGGAGGAGCTAAAGTGGGAAATAGAACAACTGCAACTCTGAGGATTAGAAGAAATGATGACCCCATT 
TATTTTGCAGAACCTCGTGTAGTGAGGGTTCAGGAAGGTGAGACTGCCAACTTTACAGTTCTCAGAA 
ATGGATC TGTTGATGTGAC TTGC ATGGTCCAGTATGC TAC C AAGGATGGGAAGGCTAC TGC AAGAGA 
GAGAGATTTCATTCCTGTTGAAAAAGGAGAAACGCTCATTTTTGAGGTTGGAAGTAGACAGCAGAGC 
ATATCCATATTTGTTAATGAAGATGGTATCCCGGAAACAGATGAGCCCTTTTATATAATCCTCTTGA 
ATTCAAC AGGTGATACAGTAGTATATC AATATGG AGTAGC TACAGTAATAATTGAAGC TAATGATGA 
CCCAAATGGC ATTTTTTC TC TGGAGCCC ATAG AC AAAGC AGTGGAAG AAGGAAAGACTAATGC ATTT 
TGG ATTTTGAGGC ACCGAGGATAC TTTGGTAGTGT TTCTGTATCTTGGC AGCTC TTTC AGAATGATT 
CTGCTTTGCAGCCTGGGCAGGAGTTCTATGAAACTTCAGGAACTGTTAACTTCATGGATGGAGAAGA 
AGCAAAACCAATCATTCTCCATGCTTTTCCAGATAAAATTCCTGAATTCAATGAATTTTATTTCCTA 
AAACTTGTAAAC ATTTC AGGTCC TGGGGGC C AGC TAGCAGAAACCAACCTC CAGGTGAC AGTAATGG 
TTCCATTCAATGATGATCCCTTTGGAGTTTTTATCTTGGATCCAGAGTGTTTAGAGAGAGAAGTGGC 
AGAAGATGTCCTGTCTGAAGATGATATGTCTTATATTACCAACTTCACCATTTTGAGGCAGCAGGGT 
GTGTTTGGTGATGTACAACTGGGCTGGGAAATACTGTCCAGTGAGTTCCCTGCTGGTTTGCCACCAA 
TGATAGATTTTTTACTGGTTGGAATTTTCCCCACCACCGTGCATTTACAACAGCACATGCGGCGTCA 
CCAC AGTGGAACGGATGC TTTGTAC TTTACCGGACTAGAGGGTGCATTTGGG ACTGTTAATCC AAAA 
TACCATCCCTCCAGGAATAATACAATTGCCAACTTTACATTCTCAGCTTGGGTAATGCCCAATGCCA 
ATACGAATGGATTCATTATAGCGAAGGATGACGGTAATGGAAGCATCTACTACGGGGTAAAAATACA 
AACAAACGAATCCCATGTGACACTTTCCCTTCATTATAAAACCTTGGGTTCCAATGCTACATAC^TT 
GCCAAGACAACAGTCATGAAATATTTAGAAGAAAGTGTTTGGCTTCATCTACTAATTATCCTGGAGG 
ATGGTATAATCGAATTCTACCTGGATGGAAATGCAATGCCCAGGGGAATCAAGAGTCTGAAAGGAGA 
AGCCATTACTGACGGTCCTGGGATACTGAGAATTGGAGCAGGGATAAATGGCAATGACAGATTTACA 
GGTCTGATGCAGGATGTGAGGTCCTATGAGCGGAAACTGACGCTTGAAGAAATTTATGAACTTCATG 
CCATGCCCGCAAAAAGTGATTTACACCCAATTTCTGGATATCTGGAGTTCAGACAGGGAGAAACTAA 
CAAATCATTCATTATTTCTGCAAGAGATGACAATGACGAGGAAGGAGAAGAATTATTCATTCTTAAA 
CTAGTTTCTGTATATGGAGGAGCTCGTATTTCGGAAGAAAATACTACTGCAAGATTAACAATACAAA 
AAAGTGACAATGCAAATGGCTTGTTTGGTTTCACAGGAGCTTGTATACCAGAGATTGCAGAGGAGGG 
ATCAACCATTTCTTGTGTGGTTGAGAGAACCAGAGGAGCTCTGGATTATGTGCATGTTTTTTACACC 
ATTTCACAGATTGAAACTGATGGCATTAATTACCTTGTTGATGACTTTGCTAATGCCAGTGGAACTA 
TTACATTCCTTCCTTGGCAGAGATCAGAGGTTCTGAATATATATGTTCTTGATGATGATATTCCTGA 
ACTTAATGAGTAT TTCCGTGTG ACATTGGTTTC TGCAATTCCTGG AG ATGGGAAGCTAGGC TC AAC T 
CCTACCAGTGGTGCAAGCATAGATCCTGAAAAGGAAACGACTGATATCACCATCAAAGCTAGTGATC 
ATCCATATGGCTTGCTGCAGTTCTCCACAGGGCTGCCTCCTCAGCCTAAGGACGCAATGACCCTGCC 
TGCAAGCAGCGTTCCACATATCACTGTGGAGGAGGAAGATGGAGAAATCAGGTTATTGGTCATCCGT 
GCACAGGGACTTC TGGGAAGGGTG ACTGCGGAATTTAGAAC AGTGTCC TTGACAGCATTCAGTCCTG 
AGGATTACCAGAATGTTGCTGGCACATTAGAATTTCAACCAGGAGAAAGATATAAATACATTTTCAT 
AAACATCACTGATAATTCTATTCCTGAACTC

GGAGGAGTAGCTGAACTCTTTAGGGTTGATGGAAGTGGTAGTGCCAGTCTAGGAGTGGCTTCCCAAA 
TTCTAGTGACAATTGCAGCCTCTCACCACGCTCATGGCGTATTTGAATTTAGCCCTGAGTCACTCTT 
TGTCAGTGGAACTGAACCAGAAGATGGGTATAGCACTGTTACATTAAATGTTATAAGACATCATGGA 

C TGGC AAC ATC AC ATTTG AGATTGGGC AGACGAGCGCC AATATC AC TGTGGAG ATATTGCC TGACGA 
AGACCCAGAACTGGATAAGGCATTCTCTGTGTCAGTCCTCAGTGTTTCCAGTGGTTCTTTGGGAGCT 
CATATTAATGCCACGTTAACAGTTTTGGCTAGTGATGATCCATATGGGATATTCATTTTTTCTGAGA 
AAAAC AGACC TGTTAAAGTTG AGG AAGC AAC CC AG AAC ATC ACACTATCAATAATAAGGTTGAAAGG 
CCTCATGGGAAAAGTCCTTGTCTCATATGCAACACTAGATGATATGGAAAAACCACCTTATTTTCCA 
CC TAATTTAGCGAG AGC AAC TCAAGGAAG AG AC TATATACC AGC TTCTGGATTTGC TC TTTTTGG AG 
CTAATCAGAGTGAGGCAACAATAGCTATTTCAATTTTGGATGATGATGAGCCAGAAAGGTCCGAATC 
TGTCTTTATCGAACTACTCAACTCTACTTTAGTAGCGAAAGTACAGAGTCGTTCAATTCCAAATTCT 
C C ACGTCT TGGGCC TAAGGTAG AAACT ATTG CGC AACTAATT ATC ATTGCCAATGATGATGC ATTTG 
GAACTC TTC AGCTCTC AGC AC C AATTGTCCGAGTGGCAGAAAAT C ATGTTGGAC CC ATTATC AATGT 
GACTAGAACAGGAGGAGCATTTGCAGATGTCTCTGTGAAGTTTAAAGCTGTGCCAATAACTGCAATA 
GCTGGTGAAGATTATAGTATAGCTTCATCAGATGTGGTCTTGCTAGAAGGGGAAACCAGTAAAGCCG 
TGCCAATATATGTCATTAATGATATCTATCCTGAACTGGAAGAATCTTTTCTTGTGCAACTGATGAA 
TGAAACAACAGGAGGAGCCAGACTAGGGGCTTTAACAGAGGCAGTCATTATTATTGAGGCCTCTGAT 
GACC CC T ATGGATTATTTGGTTTTC AGATTACTAAACTTATTGTAGAGGAACCTGAGTTTAAC TC AG 
TGAAGGTAAACCTGCCAATAATTC GAAATTC TGGGACACTCGGC AATGTTACTGTTCAGTGGGTTGC 
CACCATTAATGGACAGCTTGCTAC TGGCGACCTGCGAGTTGTC TCAGGTAATGTGACCTTTGCCCCT 
GGGGAAACCATTCAAACCTTGTTGTTAGAGGTCCTGGCTGACGACGTTCCGGAGATTGAAGAGGTTA 
TCCAAGTGCAACTAACTGATGCCTCTGGTGGAGGTACTATTGGGTTAGATCGAATTGCAAATATTAT 
TATTCC TG CC AATGATG ATC CTTATGGT ACAGTAGCCTTTGC TC AGATGGTTTATCGTGTTCAAGAG 
CCTCTGGAAAGAAGTTCCTGTGCTAATATAACTGTCAGGCGAAGCGGAGGGCACTTTGGTCGGCTGT 
TGTTGTTC TAC AGTACTTC CGACATTGATGTAGTGGCTCTGGC AATGG AGG AAGGTCAAGATTTACT 
GTCCTACTATGAATCTCCAATTCAAGGGGTGCCTGACCCACTTTGGAGAACTTGGATGAATGTCTCT 
GCCGTGGGGGAGCCCCTGTATACCTGTGCCACTTTGTGCCTTAAGGAACAAGCTTGCTCAGCGTTTT 
C ATTTTTC AGTGC TTCTGAGGGTCC CC AGTGTTTCTGGATG AC ATC ATGG ATC AGCCC AGCTGTGAA 
CAATTCAGACTTCTGGACCTACAGGAAAAACATGACCAGGGTAGCATCTCTTTTTAGTGGTCAGGCT 
GTGGCTGGGAGTGACTATGAGCCTGTGACAAGGCAATGGGCCATAATGCAGGAAGGTGATGAATTCG 
CAAATCTCACAGTGTCTATTCTTCCTGATGATTTCCCAGAGATGGATGAGAGTTTTCTAATTTCTCT 
CCTTGAAGTTCACCTCATGAACATTTCAGCCAGTTTGAAAAATCAGCCAACCATAGGACAGCCAAAT 
ATTTCTACAGTTGTCATAGCACTAAATGGTGATGCCTTTGGAGTGTTTGTGATCTACAATATTAGTC 
C C AATAC TTCCGAAGATGGC TTATTTGTTGAAGT TCAGG AGC AGCCCC AAACC T TGGTGGAGCTGAT 
GATACACAGGACAGGGGGCAGCTTAGGTCAAGTGGCAGTCGAATGGCGTGTTGTTGGTGGAACAGCT 
AC TGAAGGTTTAGATTTTATAGGTGC TGGAG AGATTCTG ACC TT TGCTGAAGGTGAAACCAAAAAGA 
C AGTCATTTTAACC ATC TTGG ATG ACTCTGAACC AGAGGATGAC G AAAGTATCAT AGTT AGTTTGGT 
GTAC ACTGAAGGTGGAAGTAGAATTTTGCCAAGC TCCGACAC TGTT AGAGTGAAC ATTTTGGCCAAT 
GAC AATGTGGC AGGAAT TGTTAGC TTTC AG AC AGC TTCCAG ATC TGTC ATAGGTC ATGAAGGAGAAA 
TTTTAC AATTCC ATGTGATAAGAAC TTTCCC TGGTCGAGG AAATGTTACTGTTAACTGGAAAATTAT 
TGGGCAAAATCTAGAACTCAATTTTGCTAACTTTAGCGGACAACTTTTCTTTCCTGAGGGGTCGTTG 
AATACAACATTGTTTGTGCATTTGTTGGATGACAACATTCCTGAGGAGAAAGAAGTATACCAAGTCA 
TT CTGTATGATGTC AGG AC AC AAGGAGTTCC ACC AGCCGGAATCGCCCTGCTTG ATGC TCAAGGATA 
TGC AGC TGTCC TCAC AGTAGAAGCCAGTG ATG AACC AC ATGG AG TTTTAAATT TTGCTCTTTC ATC A 
AGATTTGTGTTACTACAAGAGGCTAACATAACAATTCAGCTTTTCATCAACAGAGAATTTGGATCTC 
TAGGAGC TATC AATGTC AC ATAT ACC ACGGTTCC TGGAATGCTGAGTCTGAAG AACCAAAC AGTAGG 
AAACCTAGCAGAGCCAGAAGTTGATTTTGTCCCTATCATTGGCTTTCTGATTTTAGAAGAAGGGGAA 
AC AGC AGCAGCC ATC AAC ATTACC ATTCTTGAGG ATGATGT AC C AGAGC TAG AAG AATATTTCCTGG 
TGAATTTAACTTACGTTGGACTTACCATGGCTGCTTCAACTTCATTTCCTCCCAGACTAGGTATGAG 
GGGTTTCTTGTTTGTTTCTTTTTGCTCACTTCAAATGAAATG AAGAAACTTCATTTTTGAATCAGAA 
GTGATCATTGTGCTGTTTTGTTAATCTTAGCTATGTGTTAAA 



ORF Start: ATG at 23 | ORF Stop: TGA at 8282

SEQ ID NO: 166 | 2753 aa | MW at 301743.8kD


NOV39d, 
CG150799-01 
Protein Sequence 


MVWTFEVEGGPNPPDEDLSPVKGNITFPPGRATVI^^TVLDDEVPENDEIFLIQLKSVEGGAEI 
TSRNS IEI I IKKNDSPVRFLQSI YLVPEEDHILI I PWRGKDNNGNLIGSDEYEVSISYAVTTGNST 
AHAQQNLDFIDLQPNTTWFPPFI HESHLKFQI VDDTTPEI AESFHIMLLKDTLQGDAVLI S PS WQ 
VTIKPNDKPYGVLSFNS VLFERTVI IDEDRI SRYEEITWRNGGTHGNVSANWVLTRNSTDPSPVTA 
DIRPSSGVLHFAQGQMLATIPLTVVDDDLPEEAEAYLLQILPHTIRGGAEVSEPAEDSDDVYGLITF 
FPMENQKI ESS PGERYLSIiS FTRLGGTKGDVRLL YSVLYI PAGAVDPLQAKEGILNISRRNDLIFPE 
QKTQVTTKLPIR1TOAFFQNGAHFLVQLETVELLNIIPLIPPISPRFGEICNISLLVTPAIANGEIGF 
LSNLPIILHEPEDFAAEVVYIPLHMX5TIX3QATVYWSLKPSGFNSKAVTPDDIGPFNGSVLFLSGQS 
DTT INI T IKGDD I PEMNETVTL SLDRVNVENQVLKSG YTSRDL I ILENDDPGG VFEFSPAS RGP YVI 
KEGESVELHIIRSRGSLVKQFLHYRVEPRDSNEFYGNTG^LEFKPGEREIVITLLARLDGIPELDEH 
YWVVLSSHGERESKLGSATIVNITILKNDDPHGIIEFVSDGLIVMINESKGDAIYSAVYDVVRNRGN 
FGDVSVSWWS PDFTODVF P VOGTWFGDOEFSKNIT I YSL PDE I P EEMEEFTVI LLNG^TGGAKVGN 
RTOATLRIRRNDDPIYFAEPRVVRVQEGETANF

KGETLIFEVGSRQQSISI FVNEDG IPETDEPFYI ILLNSTGDTWYQYGVATVT IEANDDPNGI FSL 
EPIDKAVEEGKTNAFWILRHRGYFGSVSVSWQLFQ1TOSALQPGQEFYETSGTVNFMDGEEAKPIIM 
AFPDKI PEFNEFYFLKLVNI SGPGGQLAETNLQVTVMVPFNDDPFGVF ILDPECLEREVAEDVLSED 
DMSYITNFTILRQQGVFGDVQMWEILSSEFPAGLPPMIDFLLVGIFPTTVHLQQHMRRHHSGTO 
YFTGLEGAFGTVNPKYHPSRNOTIAOTTFSAWVOT^ 

LSLHYKTLG SNAT Y I AKTTVMKYLEES VWLHLIi 1 1 LEDG 1 1 EFYLDGNAMPRG IKSLKGEAI TDG PG 
ILRIGAGINGNDRFTGLMQDVRSraRKLTLEEIYELHAMPAKSDLHPISGYLEFRQGETNKSFIISA 
RDDNDEEGEELFILKLVSVYGGARI SEENTTARLTIQKSDNANGLFGFTGAC I PEI AEEGSTI SC W 
ERTRGALDYVHVFYTI SQIETDGINYLVDDFANASGTITFL PWQRSEVLNI YVLDDDI PELNEYFRV 
TLVSAI PGDGKLGSTPTSGAS IDPEKETTDITIKASDHPYGLLQFSTGLPPQPKDAMTLPASSVPHI 
TVEEEDGEIRLLVIRAQGLLGRVTAEFRTVSLTAFSPEDYQNVAGTLEFQPGERYKYIFINITDNSI 
PELEKSFKVELLNLEGGVAELFRVDGSGSASLGVASQILVTIAASDHAHGVFEFSPESLFVSGTEPE 
DGYSTVTLWIRHHGTLSPVTLHWNIDSDPIX5DLAFTSGNITFEIGQTSANITVEILPDEDPELDKA 
FSVSVLSVS SGSLGAHINATLTVLASDDPYG IFIFSEKNRPVKVEEATQNI TLSI IRLKGLMGKVLV 
SYATLDDMEKPPYFPPNLARATQGRDYI PASGFALFGANQSEAT IAIS ILDDDEPERSESVF IELLN 
STLVAKVQSRSIPNSPRLGPKVETIAQLII IAXTODAFGTLQLSAPIVRVAENHVGPI INVTRTGGAF 
ADVSVKFKAVTITAIAGEDYSIASSDVVLLEGETSKAVPIYVI^ 

LGALTEAVI I IEASDDPYGLFGFQ I TKL IVEEPEFNS VKVNLPI IRNSGTLGNVTVQ WVATINGQLA 
TGDLRWSGNVTFAPGETIQTLLLEVLADDVPEIEEVIQVQLTDASGGGTIGLDRIANI I IPANDDP 
YGTVAFAQMVYRVQEPLERSSCANITVRRSGGHFGRLLLFYSTSDIDVVALAMEEGQDLLSYYESPI 
QGVPDPLWRTWMNVSAVGEPLYTCATLCLKEQACSAFSFFSASEX5PQCFWMTSWISPAVNNSDFWTY 
RK^TRVASLFSGQAVAGSDYEPVTRQWAIMQEGDEFANLWSILPDDFPEMDESFLISLIiEVHLMN 
ISASLKNQPTIGQPNISTVVIALNGDAFGVFVIYNISPNTSEDGLFVEVQEQPQTLVELMIHRTGGS 
LGQVAVEWRWGGTATEGLDFIGAGEILTFAEGETKKTVILTILDDSEPEDDESIIVSLVYTEGGSR 
ILPS SDTVRWILAITONVAGIVSFQTASRSVIGHEGEILQFHVIRTFPGRGNVTVNW^ IGQNLELN 
F ANF SGQLFFPEGSLNTTLFVHLIjDDNI PEEKEVYQVILYDVRTQGVPPAG I ALLDAQGYAAVLTVE 
ASDEPHGVLNFALSSRFVLLQEANITIQLFrNREFGSLGAINVTYT 

DFVPIIGFLILEEGETAAAINITILEDDVPELEEYFLVNLTYVGLTMAASTSFPPRLGMRGFLFVSF 
CSLQMK 
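
The ORF coordinates and the protein length reported for NOV39d can be cross-checked with simple arithmetic. The sketch below assumes 1-based positions and assumes that the ORF Stop position marks the first base of the stop codon; that reading is inferred from the reported numbers and is not stated in the text. The function name is hypothetical.

```python
def codons_in_orf(start: int, stop: int) -> int:
    """Count coding codons between a 1-based start (first base of ATG)
    and a 1-based stop (first base of the stop codon, by assumption)."""
    span = stop - start
    if span <= 0 or span % 3 != 0:
        raise ValueError("start and stop are not in the same reading frame")
    return span // 3


# NOV39d: ORF Start at 23, ORF Stop at 8282, reported length 2753 aa
print(codons_in_orf(23, 8282))
```

Under these assumptions the codon count equals the reported 2753 residues, so the coordinates and the length are mutually consistent.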



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 39B. 




Table 39B. Comparison of NOV39a against NOV39b through NOV39d.

Protein Sequence | NOV39a Residues/Match Residues | Identities/Similarities for the Matched Region
NOV39b | 1..2741 / 1..2741 | 2684/2741 (97%) / 2685/2741 (97%)
NOV39c | 1..1456 / 1..1456 | 1442/1456 (99%) / 1443/1456 (99%)
NOV39d | 1..2753 / 1..2753 | 2700/2753 (98%) / 2700/2753 (98%)
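
The percentages in Table 39B follow from the identity counts and the aligned length. A minimal sketch, assuming the percentage is truncated to a whole number (an assumption that reproduces these three rows; other tables in this document may round differently). The helper name is hypothetical.

```python
def percent_truncated(matches: int, aligned: int) -> int:
    """Whole-number percentage, truncated toward zero (assumed convention)."""
    return matches * 100 // aligned


# Identity counts for NOV39a against NOV39b, NOV39c and NOV39d
rows = {"NOV39b": (2684, 2741), "NOV39c": (1442, 1456), "NOV39d": (2700, 2753)}
for name, (matches, aligned) in rows.items():
    print(name, percent_truncated(matches, aligned))
```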



Further analysis of the NOV39a protein yielded the following properties shown in
Table 39C.



Table 39C. Protein Sequence Properties NOV39a 
PSort analysis:


0.5050 probability located in cytoplasm; 0.3836 probability located in
microbody (peroxisome); 0.1851 probability located in lysosome (lumen);
0.1000 probability located in mitochondrial matrix space


SignalP analysis: 


No Known Signal Sequence Predicted 
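
The PSort values in Table 39C can be ranked programmatically. The sketch below simply picks the highest-scoring compartment; the dictionary layout is an illustrative assumption, not PSort output format. Note that the four listed values sum to more than 1, so they are best read as independent scores rather than a normalized distribution.

```python
# PSort scores for NOV39a as listed in Table 39C
psort_scores = {
    "cytoplasm": 0.5050,
    "microbody (peroxisome)": 0.3836,
    "lysosome (lumen)": 0.1851,
    "mitochondrial matrix space": 0.1000,
}

# Highest-scoring predicted location
best_location = max(psort_scores, key=psort_scores.get)
total = sum(psort_scores.values())

print(best_location, round(total, 4))
```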



A search of the NOV39a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 39D.



Table 39D. Geneseq Results for NOV39a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV39a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
AAE10925 | Human monogenic audiogenic seizure-susceptible-1 (mass1) protein - Homo sapiens, 2777 aa. [WO200165927-A1, 13-SEP-2001] | 1..2753 / 1..2777 | 2736/2778 (98%) / 2739/2778 (98%) | 0.0
AAE10924 | Mouse monogenic audiogenic seizure-susceptible-1 (mass1) protein - Mus musculus, 2780 aa. [WO200165927-A1, 13-SEP-2001] | 1..2739 / 1..2761 | 2295/2762 (83%) / 2516/2762 (91%) | 0.0
AAE10949 | Mouse mass1 protein mutant (7009deltaG) - Mus musculus, 2071 aa. [WO200165927-A1, 13-SEP-2001] | 1..2049 / 1..2071 | 1710/2072 (82%) / 1878/2072 (90%) | 0.0
ABG61545 | Human transporter and ion channel, TRICH15, Incyte ID 7476089CD1 - Homo sapiens, 759 aa. [WO200240541-A2, 23-MAY-2002] | 1531..2288 / 1..746 | 740/758 (97%) / 740/758 (97%) | 0.0
ABB05663 | Human signal transduction protein clone amy2_10p7 - Homo sapiens, 1615 aa. [WO200198454-A2, 27-DEC-2001] | 2232..2741 / 9..518 | 506/510 (99%) / 507/510 (99%) | 0.0

In a BLAST search of public sequence databases, the NOV39a protein was found to
have homology to the proteins shown in the BLASTP data in Table 39E.



Table 39E. Public BLASTP Results for NOV39a

Protein Accession Number | Protein/Organism/Length | NOV39a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
Q8WXG9 | Very large G protein-coupled receptor 1b - Homo sapiens (Human), 6307 aa. | 1..2741 / 180..2945 | 2721/2766 (98%) / 2723/2766 (98%) | 0.0
Q91ZS2 | MASS1 - Mus musculus (Mouse), 2780 aa. | 1..2739 / 1..2761 | 2293/2762 (83%) / 2515/2762 (91%) | 0.0
Q8VHN7 | Very large G protein-coupled receptor 1 - Mus musculus (Mouse), 6298 aa. | 1..2741 / 179..2941 | 2293/2764 (82%) / 2514/2764 (89%) | 0.0
Q91ZS1 | MASS1.2 - Mus musculus (Mouse), 2238 aa. | 563..2739 / 29..2219 | 1838/2192 (83%) / 2004/2192 (90%) | 0.0
Q8TF58 | KIAA1943 protein - Homo sapiens (Human), 1054 aa (fragment). | 234..1273 / 1..1050 | 1037/1050 (98%) / 1037/1050 (98%) | 0.0



Example 40. 

The NOV40 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 40A. 



Table 40A. NOV40 Sequence Analysis 




SEQ ID NO: 167 | 2833 bp

NOV40a,
CG151014-01
DNA Sequence


CAAAGATCCAGTTTGGAAATGAGAGAGGACTAGCATGACACATTGGCTCCACCATTGATATCTCCCA 


GAGGTACAGAAACAGGATTCATGAAGATGTTGACAAGACTGCAAGTTCTTACCTTAGCTTTGTTTTC 
AAAGGGATTTTTAC TCTC TTTAGGGGACC ATAACTTTC TAAGGAGAGAGATTAAAATAGAAGGTGAC 
CTTGTTTTAGGGGGCCTGTTTC C TATTAACGAAAAAGGCACTGGAACTGAAGAATGTGGGCGAATC A 
ATGAAGACCGAGGGATTCAACGCCTGGAAGCCATGTTGTTTGCTATTGATGAAATCAACAAAGATGA 
TTACTTGCTACCAGGAGTGAAGTTGGGTGTTCACATTTTGGATACATGTTCAAGGGATACCTATGCA 
TTGGAGCAATC^CTGGAGTTTGT(^GGGCATCTTTGACAAAAGTGK3ATGAAGCTGAGTATATGTGTC 
CTGATGGATCCTATGCCATTCAAGAAAACATCCCACTTCTCATTGCAGGGGTCATTGGTGGCTCTTA 
TAGCAGTGTTTCCATACAGGTGGCAAACCTGCTGCGGCTCTTCCAGATCCCTCAGATCAGCTACGCA 
TCCACCAGCGCCAAACTCAGTGATAAGTCGCGCTATGATTACTTTGCCAGGACCGTGCCCCCCGACT 
TCTACCAGGCCAAAGCCATGGCTGAGATCTTGCGCTTCTTC^CTGGACCTACGTGTCCACAGTAGC 
CTCCGAGGGTGATTACGGGGAGACAGGGATCGAGGCCTTCGAGCAGGAAGCCCGCCTGCGCAACATC 
TGCATCGCTACGGCGGAGAAGGTGGGCCGCTCCAACATCCGCAAGTCCTACGACAGCGTGATCCGAG 
AACTGTTGCAGAAGCCCAACGCGCGCGTCGTGGTCCTCTTCATGCGCAGCGACGACTCGCGGGAGCT 
CATTGCAGCCGCCAGCCGCGCCAATGCCTCCTTCACCTGGGTGGCCAGCGACGGCTGGGGCGCGCAG 
GAGAGCATCATCAAGGGCAGCGAGCATGTGGCCTACGGCGCCATCACCCTGGAGCTGGCCTCCCAGC 
CTGTCCGCCAGTTCGACCGCTACTTCCAGAGCCTCA&cfe

C CGGGACTTCTGGGAGCAAAAGTTTCAGTGCAGC C TCC AGAAC AAACGC AACCAC AGGCGCG TC TGC 
GACAAGCACCTGGCCATCGACAGCAGCAACTACGAGCAAGAGTCCAAGATCATGTTTGTGGTGAACG 
CGGTGTATGCCATGGCCCACGCTTTGCACAAAATGCAGCGCACCCTCTGTCCCAACACTACCAAGCT 
TTGTGATGC TATG AAGATC CTGGATGGGAAGAAGTTGTACAAGG ATTACTTGCTGAAAATC AAC TTC 
ACGGCTCCATTCAACCCAAATAAAGATGCAGATAGCATAGTCAAGTTTGACACTTTTGGAGATGGAA 
TGGGGCGATACAACGTGTTCAATTTCCAAAATGTAGGTGGAAAGTATTCCTACTTGAAAGTTGGTCA 
CTGGGCAGAAACCTTATCGCTAGATGTCAACTCTATCCACTGGTCCCGGAACTCAGTCCCCACTTCC 
CAGTGCAGCGACCCCTGTGCCCCCAATGAAATGAAGAATATGCAACCAGGGGATGTCTGCTGCTGGA 
TTTGC ATCCCCTGTGAACC CTACGAATAC C TGG CTGATG AGTTTAC C TGT ATGGATTGTGGGTC TGG 
ACAGTGGCCCACTGCAGACCTAACTGGATGCTATGACCTTCCTGAGGACTACATCAGGTGGGAAGAC 
GCCTGGGCCATTGGCCCAGTCACCATTGCCTGTCTGGGTTTTATGTGTACATGCATGGTTGTAACTG 
T TTTTATCAAGC AC AACAACAC ACCCTTGGTC AAAGC ATC GGGCCGAGAACTC TGC TAC ATC TTATT 
GTTTGGGGTTGGC CTGTCATAC TGC ATGAC ATTCTTC TTC ATTGCC AAG C CATCAC CAGTCATCTGT 
GCATTGCGCCGACTCGGGCTGGGGAGTTCCTTCGCTATCTGTTACTCAGCCCTGCTGACCAAGACAA 
ACTGCATTGCCCGCATC TTCGATGGGGTC AAGAATGGCGC TC AGAGGCC AAAATTCATCAGC CCC AG 
TTCTCAGGTTTTCATCTGCCTGGGTCTGATCCTGGTGCAAATTGTGATGGTGTCTGTGTGGCTCATC 
C TGGAGGCCCCAGGC AC C AGGAGGT ATACCC TTAC AGAGAAGCGGGAAACAGTC ATCCTAAAATGC A 
ATGTC AAAGATTCC AGCATGTTGATCTCTCTTACCTACGATGTGATC CTGGTGATCTTATGC AC TGT 
GTACGCC TTC AAAACGCGGAAGTGCCCAGAAAATTTCAACGAAGC TAAGTTC AT AGGTTTTACCATG 
TACACCACGTGCATCATCTGGTTGGCCTTCCTCCCTATATTTTATGTGACATCAAGTGACTACAGAC 
CTCTGCAAGCACGT ATGTGTCAACGGTGTGC AATGGGCGGGAAGTCC TCGACTCCACCACCTCATC T 
CTGTGATTGTGAATTGCAGTTCAGTTCTTGTGTTTTTAGACTGTTAGACAAAAGTGCTCACGTGCAG 
CTCCAGAATATGGAAACAGAGCAAAAGAACAACCCTAGTACCTTTTTTTAGAAACAGTACGATAAAT 
TATTTTTGAGGACTGTATATAGTGATGTGCTAGAACTTTCTAGGCTGAGTCTAGTGCCCCTATTATT 


AACAATTCCCCCAGAACATGGAAATAACCATTGTTTACAGAGCTGAGCATTGGTGACAGGGTCTGAC 


ATGGTCAGTCTAC TTC AAG 




ORF Start: ATG at 88 | ORF Stop: TAG at 2662

SEQ ID NO: 168 | 858 aa | MW at 96975.6kD

NOV40a,
CG151014-01
Protein Sequence


MKMLTRLQVLTIiALF SKGFLL SLGDHNFLRREI KI EGDL VLGGLF P INEKGTGTEECGRINEDRGI Q 
RLEAMLFAIDEINKDDYLLPGVKIXSVHI^ 

QENIPLLIAGVTGGSySSVSIQVANLLRLFQIPQISYASTSAKLSDKSRYDYFARTVPPDFYQAKAM 
AE I LRFFNWTYVS TVASEGDYGETG IEAFEQEARLRNI C I ATAEKVGRSN I RKS YDSVTRELLQKPN 
ARVWLFMRSDDSRELIAAASRANASFTWVASDGWGAQES I IKGSEHVAYGAITLELASQPVRQFDR 
YFQSLNPYNNHRNPWFRDFWEQKFQCSLQNKRNHRRVCDKH^ 
ALHKMQRTLCPNTTKLCDAMKILTCKKLYKDYLL^^ 

NFQNVGGKYSYLKVGHWAETIjSLDVNS IHWSRNSVPTSQC SDPCAPNEMKNMQPGDVCCWIC I PCEP 
YEYLADEFTCMDCGSGQWPTADLTGC YDL PEDYIRWEDAWAI GPVT I ACLGFMCTCMWTVF IKHNN 
TPLVKASGRELC YILLFGVGLSYCMTFFFIAKPSPVICALRRLGLGSSFAICYSALLTKTNC IARI F 
DGVKNGAQRPKFISPSSQWICLGLILVQIVWSVV^ILEAPGTRRYTLTEKRETVILKCNVKDSSM 
LISLTYDVILVILCTVYAFKTRKCPENFNEAKFIGFTMYTTCIIWIiAFLPIFYVTSSDYRPLQARMC 
QRCAMGGKSSTPPPHLCDCELQFSSCVFRIiLDKSAHVQIiQNMETEQKNNPSTFF 





SEQ ID NO: 169 | 1758 bp

NOV40b,
CG151014-02
DNA Sequence


CAAAGATCCAGTTTGGAAATGAGAGAGGACTAGCATGACACATTGGCTCCACCATTGATATCTCCCA 


GAGGTACAGAAACAGGATTCATGAAGATGTTGACAAGACTGCAAGTTCTTACCTTAGCTTTGTTTTC 
AAAGGGATTTTTACTCTCTTTAGGGGACCATAACTTTCTAAGGAGAGAGATTAAAATAGAAGGTGAC 
CTTGTTTTAGGGGGCCTGTTTCCTATTAACGAAAAAGGCACTGGAACTGAAGAATGTGGGCGAATCA 
ATGAAGACCGAGGGATTCAACGCCTGGAAGCCATGTTGTTTGCTATTGATGAAATCAACAAAGATGA 
TTACTTGCTACCAGGAGTGAAGTTGGGTGTTCACATTTTGGATACATGTTCAAGGGATACCTATGCA 
TTGGAGCAATCACTGGAGTTTGTCAGGGCATCTTTGACAAAAGTGGATGAAGCTGAGTATATGTGTC 
C TGATGGATCCTATGCC ATTCAAGAAAAC ATCCCACTTC TC ATTGC AGGGGTC ATTGGTGGCTCTTA 
TAGCAGTGTTTCCATACAGGTGGCAAACCTGCTGCGGCTCTTCCAGATCCCTCAGATCAGCTACGCA 
TCCACCAGCGCCAAACTCAGTGATAAGTCGCGCTATGATTACTTTGCCAGGACCGTGCCCCCCGACT 
TCTACCAGGCCAAAGCCATGGCTGAGATCTTGCGCTTCTTCAACTGGACCTACGTGTCCACAGTAGC 
C TCCGAGGGTGATTACGGGGAGAC AGGGATCGAGGCC TTCG AGC AGGAAGCCCGC CTGCGCAACATC 
TGCATCGCTACGGCGGAGAAGGTGGGCCGCTCCAACATCCGCAAGTCCTACGACAGCGTGATCCGAG 
AACTGTTGCAGAAGCCCAACGCGCGCGTCGTGGTCCTCTTCATGCGCAGCGACGACTCGCGGGAGCT 
C ATTGCAGCCGCCAGCCGCGC C AATGCCTCCTTCACC TGGGTGGCCAGC GACGGC TGGGGCGCGC AG 
GAGAGCATCATCAAGGGCAGCGAGCATGTGGCCTACGGCGCCATCACCCTGGAGCTGGCCTCCCAGC 
CTGTCCGCCAGTTCGACCGCTACTTCCAGAGCCTCii!^^

GACAAGCACCTGGCCATCGACAGCAGCAACTACGAGCAAGAGTCCAAGATCATGTTTGTGGTGAACG 
CGGTGTATGCCATGGCCCACGCTTTGCACAAAATGCAGCGCACCCTCTGTCCCAACACTACCAAGCT 
TTGTGATGC TATGAAG ATC CTGG ATGGGAAGAAGTTGTAC AAGG ATTAC TTGCTGAAAATCAACTTC 
ACGGGTGCAGACGACAACCATGTGCATCTCCGTCAGCCTGAGTGGCTTTGTGGTCTTGGGCTGTTTG 
TTTGCACCCAAGGTTCACATCATCCTGTTTCAACCCCAGAAGAATGTTGTCACACACAGACTGCACC 
TC AACAGGTTC AGTGTC AGTGGAAC TGGG ACC ACATACTC TC AGTCC TC TGCAAGC ACGTATGTGC C 
AACGGTGTGCAATGGGCGGG AAGTCC TCGAC TC CACCACC TC ATC TC TGTGATTGTGAATTGC AGT T 
CAGTTCTTGTGTTTTTAGACTGTTAGACAAAAGTGCTCACGTGCAGCTCCAGAATATGGAAACAGAG 


CAAAAGAACAACCCTA 




ORF Start: ATG at 88 | ORF Stop: TAG at 1699

SEQ ID NO: 170 | 537 aa | MW at 60801.8kD


NOV40b, 
CG151014-02 
Protein Sequence 


mk^trlqvltlalfskgfllslgdhnflrreikiegdlvlggiifpinekgtgteecgrinedrgiq 
rleamlfaideiktcoddyllpgvklgvhil^^ 

qeni pll i agvi ggsyssvs iqvanllrlfqi pqi syastsaklsdksrydyfartvppdfyqakam 
aeilrffnwtwstvasegdygetgi eafeqearlrnic iataekvgrsnirksydsvirellqkpn 
arvvvlfmrsddsreliaaasranasftwvasdgwgaqes i ikgsehvaygaitlelasqpvrqfdr 
yfqslotynnhrnpwfrdfweqkfqcslqnkr 

alhkmqrtlc pnttklcdamk i lix5kkl ykdyllk inftgaddnhvhlrq pewlcglgl fvc tqgsh 
hpvstpeecchtqtapqqvqcqwnwdhiiisvlckhvcangvqwagsprlhhl i svivncssvlvfld 

C 





SEQ ID NO: 171 | 1758 bp




NOV40c, 
CG151014-03 
DNA Sequence 


CCTTGATCCAGTTTGGAAATGAGAGAGGACTAGCATGACACATTGGCTCCACCATTGATATCTCCCA 


GAGGTACAGAAACAGGATTCATGAAGATGTTGACAAGACTGCAAGTTCTTACCTTAGCTTTGTTTTC 
AAAGGGATTTTTACTCTCTTTAGGGGACCATAACTTTCTAAGGAGAGAGATTAAAATAGAAGGTGAC 
CTTGTTTTAGGGGGCCTGTTTCCTATTAACGAAAAAGGCACTGGAACTGAAGAATGTGGGCGAATCA 
ATGAAGACCGAGGGATTCAACGCCTGGAAGCCATGTTGTTTGCTATTGATGAAATCAACAAAGATGA 
TTACTTGCTACCAGGAGTGAAGTTGGGTGTTCACATTTTGGATACATGTTCAAGGGATACCTATGCA 
T TGGAGC AATCACTGGAGTTTGTCAGGGC ATCTTTGAGAAAAGTGG ATGAAGC TGAGTATATGTGTC 
CTGATGGATCCTATGCCATTCAAGAAAACATCCCACTTCTCATTGCAGGGGTCATTGGTGGCTCTTA 
TAGCAGTGTTTCCATACAGGTGGCAAACCTGCTGCGGCTCTTCCAGATCCCTCAGATCAGCTACGCA 
TCCACCAGCGCCAAACTCAGTGATAAGTCGCGCTATGATTACTTTGCCAGGACCGTGCCCCCCGACT 
TCTACCAGGCCAAAGCCATGGCTGAGATCTTGCGCTTCTTCAACTGGACCTACGTGTCCACAGTAGC 
CTCCGAGGGTGATTACGGGGAGACAGGGATCGAGGCCTTCGAGCAGGAAGCCCGCCTGCGCAACATC 
TGCATCGCTACGGCGGAGAAGGTGGGCCGCTCCAACATCCGCAAGTCCTACGACAGCGTGATCCGAG 
AACTGTTGCAGAAGCCCAACGCGCGCGTCGTGGTCCTCTTCATGCGCAGCGACGACTCGCGGGAGCT 
CATTGCAGCCGCCAGCCGCGCCAATGCCTCCTTCACCTGGGTGGCCAGCGACGGCTGGGGCGCGCAG 
GAGAGCATCATC AAGGGC AGCGAGCATGTGGCCTACGGCGCCATCAC CCTGGAGCTGGCC TCCC AGC 
CTGTCCGCCAGTTCGACCGCTACTTCCAGAGCCTCAACCCCTACAACAACCACCGCAACCCCTGGTT 
CCGGGACTTCTGGGAGCAAAAGTTTCAGTGCAGCCTCCAGAACAAACGCAACCACAGGCGCGTCTGC 
GACAAGCACCTGGCCATCGACAGCAGCAACTACGAGCAAGAGTCCAAGATCATGTTTGTGGTGAACG 
CGGTGTATGCCATGGCCCACGC TTTGC AC AAAATGC AGC GCACCCTC TGTCCCAAC AC TAC CAAGCT 
TTGTGATGCTATGAAGATCCTGGATGGGAAGAAGTTGTACAAGGATTACTTGCTGAAAATCAACTTC 
ACGGGTGCAGACGACAACCATGTGCATCTCCGTCAGCCTGAGTGGCTTTGTGGTCTTGGGCTGTTTG 
TTTGC ACC CAAGGTTCACATC ATCC TGTTTCAACCCC AGAAGAATGTTGTCAC AC AC AGAC TGC ACC 
TC AACAGGTTC AGTGTCAGTGGAAC TGGGACC AC ATACTC TC AG TC C TC TGCAAGC ACGTATGTGCC 
AACGGTGTGCAATGGGCGGGAAGTCCTCGACTCCACCACCTCATCTCTGTGATTGTGAATTGCAGTT 
C AGTTCTTGTGTTTT TAGAC TGTTAGAC AAAAGTG CTC ACGTGC AGC TCC AGAATATGGAAACAGAG 


CAAAAGAACAACCCTA 




ORF Start: ATG at 88 | ORF Stop: TAG at 1699

SEQ ID NO: 172 | 537 aa | MW at 60801.8kD


NOV40c, 
CG151014-03 
Protein Sequence 


MKMLTRLQVLTIJtf^FSKGFLLSLGDHOT^ 

RLEAl^FAIDEINKDDYLLPGVKLGVHILDTCSRDTYALEQSLEFTOASLTKVDEAEYMCPDGSYAI 

QENIPLL IAGVIGGSYS SVS IQVANLLRLFQIPQI S YASTSAKLSDKSRYDYFARTVPPDFYQAKAM 

AEILRFFNWTYVSTVASEGDYGETGIEAFEQEARLRNICIATAEKVGRSNIRKSYDSVIRELLQKPN 

ARVWLFMRSDDSRELIAAASRANASFTWVASDGWGAQES I IKGSEHVAYGAITLELASQPVRQFDR 

YFQSLNPYNNHRNPWFRDFWEQKFQCSLQNKRNH^ 

ALHKMQRTI/:PNTTKLCDAMKIIJX3K^ 

HPVSTPEECCHTQTAPQQVQCQWNWDHILSVLCKHVCANGVQWAGSPRLHHLISVIVNCSSVLVFLD 
C 




Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 40B. 



Table 40B. Comparison of NOV40a against NOV40b and NOV40c.

Protein Sequence | NOV40a Residues/Match Residues | Identities/Similarities for the Matched Region
NOV40b | 1..441 / 1..441 | 409/441 (92%) / 409/441 (92%)
NOV40c | 1..441 / 1..441 | 409/441 (92%) / 409/441 (92%)

Further analysis of the NOV40a protein yielded the following properties shown in
Table 40C.



Table 40C. Protein Sequence Properties NOV40a 


PSort analysis: 


0.6400 probability located in plasma membrane; 0.4600 probability located in 
Golgi body; 0.3700 probability located in endoplasmic reticulum (membrane); 
0.1000 probability located in endoplasmic reticulum (lumen) 


SignalP analysis: 


Cleavage site between residues 25 and 26 
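
A predicted cleavage site between residues 25 and 26 implies a 25-residue signal peptide followed by the mature chain. The sketch below illustrates the split. The 34-residue N-terminal string is an assumption: it was obtained by reading the codons that follow the ATG at position 88 of the NOV40a DNA sequence, because the printed protein sequence is partly illegible at that point. The function name is hypothetical.

```python
def split_at_cleavage(protein: str, last_signal_residue: int) -> tuple[str, str]:
    """Split a protein after the given 1-based residue number."""
    return protein[:last_signal_residue], protein[last_signal_residue:]


# N-terminus of NOV40a, reconstructed from the DNA sequence (assumption)
n_terminus = "MKMLTRLQVLTLALFSKGFLLSLGDHNFLRREIK"

signal_peptide, mature_start = split_at_cleavage(n_terminus, 25)
print(signal_peptide)
print(mature_start)
```

With these inputs the mature chain would begin at the histidine that follows the Gly-Asp pair, assuming the SignalP prediction is correct.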



A search of the NOV40a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 40D.



Table 40D. Geneseq Results for NOV40a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV40a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
AAE15990 | Human glutamate receptor, metabotrophic 3 (GRM3) protein - Homo sapiens, 877 aa. [WO200196350-A2, 20-DEC-2001] | 3..811 / 1..809 | 797/809 (98%) / 799/809 (98%) | 0.0
AAR82657 | Human mGluR3 - Homo sapiens, 877 aa. [WO9522609-A2, 24-AUG-1995] | 3..811 / 1..809 | 797/809 (98%) / 799/809 (98%) | 0.0
AAM23698 | Human EST encoded protein SEQ ID NO: 1223 - Homo sapiens, 857 aa. [WO200154477-A2, 02-AUG-2001] | 1..811 / 1..811 | 796/811 (98%) / 798/811 (98%) | 0.0
AAR64252 | Human mGluR3 - Homo sapiens, 879 aa. [WO9429449-A, 22-DEC-1994] | 1..811 / 1..811 | 796/811 (98%) / 799/811 (98%) | 0.0
AAO15105 | Human ph2SPMGluR3-CaR*AAA*Gqi5 fusion construct protein sequence - Chimeric - Homo sapiens, 1402 aa. [WO200229033-A2, 11-APR-2002] | 21..811 / 17..807 | 777/791 (98%) / 781/791 (98%) | 0.0



In a BLAST search of public sequence databases, the NOV40a protein was found to
have homology to the proteins shown in the BLASTP data in Table 40E.



Table 40E. Public BLASTP Results for NOV40a

Protein Accession Number | Protein/Organism/Length | NOV40a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
Q14832 | Metabotropic glutamate receptor 3 precursor (mGluR3) - Homo sapiens (Human), 877 aa. | 3..811 / 1..809 | 797/809 (98%) / 799/809 (98%) | 0.0
Q8TBH9 | Glutamate receptor, metabotropic 3 - Homo sapiens (Human), 877 aa. | 3..811 / 1..809 | 797/809 (98%) |
Q9QYS2 | Metabotropic glutamate receptor 3 protein - Mus musculus (Mouse), 879 aa. | 1..811 / 1..811 | 773/811 (95%) / 792/811 (97%) | 0.0
P31422 | Metabotropic glutamate receptor 3 precursor - Rattus norvegicus (Rat), 879 aa. | 1..811 / 1..811 | 772/811 (95%) / 790/811 (97%) | 0.0
JC7160 | metabotropic glutamate receptor subtype 3 precursor - mouse, 879 aa. | 1..811 / 1..811 | 771/811 (95%) / 790/811 (97%) | 0.0



PFam analysis predicts that the NOV40a protein contains the domains shown in
Table 40F.



Table 40F. Domain Analysis of NOV40a

Pfam Domain | NOV40a Match Region | Identities/Similarities for the Matched Region | Expect Value
ANF_receptor | 58..489 | 194/473 (41%) / 399/473 (84%) | 3.2e-173
7tm_3 | 576..820 | 109/283 (39%) / 217/283 (77%) | 3.1e-104
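
The match regions in Table 40F can be checked for length and overlap. The sketch below assumes the coordinates are 1-based and inclusive, which is an interpretation rather than something the text states; the class and function names are hypothetical.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    name: str
    start: int  # 1-based, inclusive (assumed)
    end: int    # 1-based, inclusive (assumed)

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def overlaps(a: Domain, b: Domain) -> bool:
    """True when the two inclusive ranges share at least one residue."""
    return a.start <= b.end and b.start <= a.end


anf = Domain("ANF_receptor", 58, 489)
tm = Domain("7tm_3", 576, 820)
print(anf.length, tm.length, overlaps(anf, tm))
```

Under these assumptions the two predicted domains do not overlap, and both fall within the 858-residue NOV40a protein.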



Example 41. 

The NOV41 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 41A.



Table 41A. NOV41 Sequence Analysis

SEQ ID NO: 173 | 880 bp




NOV41a, 
CG151297-01 
DNA Sequence 


GAATTCTGATGTGCTTCAGTGCACAGAACAGTAACAGATGAGCTGCTTTTGGGGAGAGCTTGAGTAC 


TCAGTCGGAGCATCATCATGGGGTCTAGTGCCACAGAGATTGAAGAATTGGAAAACACCACTTTTAA 
GTATCTTACAGGAGAACAGACTGAAAAAATGTGGCAGCGCCTGAAAGGAATACTAAGATGCTTGGTG 
AAGCAGCTGGAAAGAGGTGATGTTAACGTCGTCGACTTAAAGAAGAATATTGAATATGCGGCATCTG 
TGCTGGAAGCAGTTTATATCGATGAAACAAGAAGACTTCTGGATACTGAAGATGAGCTCAGTGACAT 
TCAGACTGACTCAGTCCCATCTGAAGTCCGGGACTGGTTGGCTTCTACCTTTACACGGAAAATGGGG 
ATGACAAAAAAGAAACCTGAGGAAAAACCAAAATTTCGGAGCATTGTGCATGCTGTTCAAGCTGGAA 
TTTTTGTGGAAAGAATGTACCGAAAAACATTTTCTCTTCTGACAGACTCAACAGAGAAAATTGTTAT 
TCCTCTTATAGAGGAAGCCTC^AAAGCCGAAACTTCTTCCTATGTGGCAAGC^GCTC^CCACCATT 
GTGGGGTTACACATTGCTGATGCACTAAGACGATCAAATACAAAAGGCTCCATGAGTGATGGGTCCT 
ATTCCCCAGACTACTCCCTTGCAGCAGTGGACCTGAAGAGTTTCAAGAACAACCTGGTGGACATCAT 
TCAGCAGAACAAAGAGAGGTGGAAAGAGTTAGCTGCACAAGAAGC^GAACCAGTTCACAGAAGTGT 
GAGTTTATTCATCAGTAAACACCTTTAAGTAAAAC
AAGACTTGG

ORF Start: ATG at 85 | ORF Stop: TAA at 820

SEQ ID NO: 174 | 245 aa | MW at 27787.2kD


NOV41a, 
CG151297-01 
Protein Sequence 


MGSSATEIEELENTTFKYLTGEQTEKl^QRLKGILRCLVKQLERGDVNVVDLKKNIEYAASVLEAW 
I DETRRLLDTEDELSDIQTDSVPSEVRDWLASTFTRKMGMTKKKPEEKPKFRSI VHAVQAG I FVERM 
YRKTFSLLTDSTEKIVIPLIEEASKAETSSYVASSSTTIVGLHIADALRRSNTKGSMSDGSYSPDYS 
L AAVDLKSFKNNLVDI IQQNKERWKEIiAAQEARTSSQKCEFIHQ 
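
The relationship between the NOV41a DNA sequence, its ORF start, and the encoded protein can be illustrated by translating the first codons of the ORF. The sketch below uses the standard genetic code and the first 134 bases of the NOV41a DNA sequence as printed in Table 41A; it assumes the ORF start is a 1-based position. The function name is hypothetical, and this is an illustration rather than the method used to produce the listed protein sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_orf(dna: str, start: int) -> str:
    """Translate from a 1-based start position until a stop codon or the
    last complete codon, whichever comes first."""
    protein = []
    for i in range(start - 1, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First two printed lines (134 bases) of the NOV41a DNA sequence
nov41a_5prime = (
    "GAATTCTGATGTGCTTCAGTGCACAGAACAGTAACAGATGAGCTGCTTTTGGGGAGAGCTTGAGTAC"
    "TCAGTCGGAGCATCATCATGGGGTCTAGTGCCACAGAGATTGAAGAATTGGAAAACACCACTTTTAA"
)
print(translate_orf(nov41a_5prime, 85))  # MGSSATEIEELENTTF
```

The translated fragment matches the first 16 residues of the NOV41a protein sequence, which supports reading the ORF start as position 85.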





SEQ ID NO: 175 | 1817 bp


NOV41b, 
CG151297-02 
DNA Sequence 


TCAGTGC AC AGAACAGT AAC AGATGAGC TGCTTTTGGGGAGAGC TTGAGTACTCAGTC GGTC AGTAG 


TACAGTAGC AGGC TCAC ATGT ACGGAT TGTTCTTGTGAGG AGCATC ATC AT GGGGTCT AGTGCCACA 


G AGATTGAAGAAT TGGAAAACACC AC T TTTAAGTATC TT AC AGGAG AAC AG ACTGAAAAAATGTGGC 
AGCGCCTGAAAGGAATACTAAGATGCTTGGTGAAGCAGCTGGAAAGAGGTGATGTTAACGTCGTCGA 
CTTAAAGAAGAATATTGAATATGCGGCATCTGTGCTGGAAGCAGTTTATATCGATGAAACAAGAAGA 
CTTCTGGATACTGAAGATGAGCTCAGTGACATTCAGACTGACTCAGTCCCATCTGAAGTCCGGGACT 
GGTTGGCTTCTACCTTTACACGGAAAATGGGGATGACAAAAAAGAAACCTGAGGAAAAACCAAAATT 
TCGGAGCATTGTGCATGCTGTTCAAGCTGGAATTTTTGTGGAAAGAATGTACCGAAAAACATATCAT 
ATGGTTGGTTTGGCATATCCAGCAGCTGTCATCGTAACATTAAAGGATGTTGATAAATGGTCTTTCG 
ATGTATTTGCCCTAAATGAAGCAAGTGGAGAGCATAGTCTGAAGTTTATGATTTATGAACTGTTTAC 
C AGATATGATCTTATC AACCGTTTCAAGATTCC TGTT TCTTGCC TAATCAC C TT TGCAGAAGCTTTA 
GAAGTTGGTTACGGCAAGTACAAAAATCCATATCACAATTTGATTCATGCAGCTGATGTCACTCAAA 
CTGTGCATTACATAATGCTTCATACAGGTATCATGCACTGGCTCACTGAACTGGAAATTTTAGCAAT 
GGTCTTTGCTGCTGCCATTCATGATTATGAGCATACAGGGACAACAAACAACTTTCACATTCAGACA 
AGGTCAGATG TTGCC ATTT TGTAT AATGATCGCTCTGTCCTTGAGAATC ACC ACGTGAGTGCAGCTT 
ATCGACTTATGCAAGAAGAAGAAATGAATATCTTGATAAATTTATCCAAAGATGACTGGAGGGATCT 
TCGGAACCTAGTGATTGAAATGGTTTTATCTACAGACATGTCAGGTCACTTCCAGCAAATTAAAAAT 
ATAAGAAACAGTTTGCAGCAGCCTGAAGGGATTGACAGAGCCAAAACCATGTCCCTGATTCTCCACG 
C AGCAGAC ATC AGCCACCCAGCC AAATCC TGGAAGC TGC ATTATCGGTGGACCATGGC CC TAATGGA 
GGAGTTTTTCCTGCAGGGAGATAAAGAAGCTGAATTAGGGCTTCCATTTTCCCCACTTTGTGATCGG 
AAGTCAACCATGGTGGCCCAGTCACAAATAGGTTTCATCGATTTCATAGTAGAGCCAACATTTTCTC 
TTCTGACAGACTCAACAGAGAAAATTGTTATTCCTCTTATAGAGGAAGCCTCAAAAGCCGAAACTTC 
TTCCTATGTGGCAAGCAGCTCAACCACCATTGTGGGGTTACACATTGCTGATGCACTAAGACGATCA 
AATAC AAAAGGCTCCATGAGTGATGGGTCCTATTCC CCAGACTAC TC CCTTGC AGCAGTGGACCTGA 
AGAGTTTCAAGAACAACCTGGTGGACATCATTCAGCAGAACAAAGAGAGGTGGAAAGAGTTAGTTGC 
ACAAGAAGCAAGAACCAGTTCACAGAAGTGTGAGTTTATTCATCAGTAAACACCTTTAAGTAAAACC 
TCGTGCATGGTGGCAGCTCTAATTTGACCAAAAGACTTGGAGATTTTGATTATGCTTGCTGGATATC 




TATTCTGT 




ORF Start: ATG at 117 | ORF Stop: TAA at 1722

SEQ ID NO: 176 | 535 aa | MW at 61249.3kD


NOV41b, 
CG151297-02 
Protein Sequence 


MGSSATEIEELENTTFKYLTGEQTEKMWQRLKGILRCLVKQLERGDVNVVDLKKNIEYAASVLEAVY 
IDETRRLLDTEDELSDIQTDSVPSEVRDWI^STFTRKMGMTKKKPEEKPKFRSIVHAVQAGIFVERM 
YRKTYHWGLAYPAAVIVTLKDVDKWSFDVFALNEASGEHSLKFMIYELFTRYDLINRFKIPVSCLI 
TFAEALEVGYGKYKNPYHNLIHAADVTQTVHYIMLHTC 

OTHIQTRSDVAILYlTORSVLENHHVSAAYRIiMQEEEMNILINLSKDDWRDLRm 
FQQIKNIRNSLQQPEGIDRAKTMSLILHAADISHPAKSWKLHYRW^ 

SPLCDRKSTMVAQSQIGFIDFIVEPTFSLLTDSTEKIVIPLIEEASKAETSSYVASSSTTIVGLHIA 
DALRRSNTKGSMSDGSYSPDYSLAAVDLKSFRNI^ 
Sequence comparison of the above protein sequences yields the following sequence
relationships shown in Table 41B.



Table 41B. Comparison of NOV41a against NOV41b.

Protein Sequence | NOV41a Residues/Match Residues | Identities/Similarities for the Matched Region
NOV41b | 1..159 / 1..159 | 141/159 (88%) / 148/159 (92%)






Further analysis of the NOV41a protein yielded the following properties shown in
Table 41C.

Table 41C. Protein Sequence Properties NOV41a


PSort analysis: 


0.8800 probability located in nucleus; 0.1000 probability located in 
mitochondrial matrix space; 0.1000 probability located in lysosome (lumen); 
0.1000 probability located in plasma membrane 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV41a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 41D.



Table 41D. Geneseq Results for NOV41a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV41a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
AAB85116 | Human 3',5' cyclic nucleotide phosphodiesterase (HSPDE1A3A) - Homo sapiens, 535 aa. [EP1097707-A1, 09-MAY-2001] | 1..159 / 1..159 | 141/159 (88%) / 148/159 (92%) | 5e-75
AAB85105 | Human 3',5' cyclic nucleotide phosphodiesterase (HSPDE1A3A) - Homo sapiens, 535 aa. [EP1097706-A1, 09-MAY-2001] | 1..159 / 1..159 | 141/159 (88%) / 148/159 (92%) | 5e-75
AAE07953 | Human phosphodiesterase (PDE) type 1 protein - Homo sapiens, 535 aa. [EP1097719-A1, 09-MAY-2001] | 1..159 / 1..159 | 141/159 (88%) / 148/159 (92%) | 5e-75
AAE07917 | Human phosphodiesterase (PDE) type 1 protein - Homo sapiens, 535 aa. [EP1097718-A1, 09-MAY-2001] | 1..159 / 1..159 | 141/159 (88%) / 148/159 (92%) | 5e-75
AAY80988 | Human 61 kD CaM-PDE (clone pHcam61-6N-7), SEQ ID NO:49 - Homo sapiens, 535 aa. [US6015677-A, 18-JAN-2000] | 1..159 / 1..159 | 141/159 (88%) / 148/159 (92%) | 5e-75


In a BLAST search of public sequence databases, the NOV41a protein was found to
have homology to the proteins shown in the BLASTP data in Table 41E.


Table 41E. Public BLASTP Results for NOV41a

Protein Accession Number | Protein/Organism/Length | NOV41a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
AAH22480 | Hypothetical 62.3 kDa protein - Homo sapiens (Human), 545 aa. | 1..159 / 1..159 | 141/159 (88%) / 148/159 (92%) | 1e-74
P54750 | Calcium/calmodulin-dependent 3',5'-cyclic nucleotide phosphodiesterase 1A (EC 3.1.4.17) (Cam-PDE 1A) (61 kDa Cam-PDE) (hCam-1) - Homo sapiens (Human), 534 aa. | 2..159 / 1..158 | 140/158 (88%) / 147/158 (92%) | 6e-74
Q9EPR9 | Phosphodiesterase 1A - Rattus norvegicus (Rat), 542 aa. | 1..159 / 1..159 | 134/159 (84%) / 144/159 (90%) | 6e-71
Q61481 | Calcium/calmodulin-dependent 3',5'-cyclic nucleotide phosphodiesterase 1A (EC 3.1.4.17) (Cam-PDE 1A) (61 kDa Cam-PDE) - Mus musculus (Mouse), 565 aa. | 1..159 / 21..179 | 133/159 (83%) / 143/159 (89%) | 3e-70
A45334 | 3',5'-cyclic-nucleotide phosphodiesterase (EC 3.1.4.17) 1A, calmodulin-dependent, 61K brain form - bovine, 530 aa. | 1..159 / 1..159 | 144/159 (90%) | 6e



PFam analysis predicts that the NOV41a protein contains the domains shown in
Table 41F.



Table 41F. Domain Analysis of NOV41a

Pfam Domain | NOV41a Match Region | Identities/Similarities for the Matched Region | Expect Value
PDEase | 138..159 | 9/49 (18%) / 22/49 (45%) | 0.11



Example 42. 

The NOV42 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 42A. 



Table 42A. NOV42 Sequence Analysis 




SEQ ID NO: 177 | 512 bp


NOV42a,
CG151822-01
DNA Sequence


^ATGGCGGGCTGCGCGGCGCGGGCTCCGCCGGGCTCTGAGGCGCGTCTCAGCCTCGCCACCTTCCT
GCTGGGCGCCTCGGTGCTCGCGCTGCCGCTGCTCACGCGCGCCGGCCTGCAGGGCCGCACCGGGCTG


CCATCCGAGCTTGTTTCCTGGGGTTTGTGTTCGGCTG 

TTGGAGTCACTTTGGCTQAACTGAAGCAGATTACCTGGCTCAGTGTCACAGGGCTGCTGATGGTfin'P 

CTTCGGAGAATGTCTGAGGAAGGCGGCCATGTNTACAGCTGGCTCCAATTTCAA^ 

MTGAAAAATCAGATACACATACTCTGGTGACCAGTGGAGTGTACGCTTGGTTTCRRrATPPTTr'P'P 




ACGTCGGGTGGTTTTACTCSGAGTATTGGAACTCAGGTGATfir:'? 

ORF Start: ATG at 3 | ORF Stop: TGA at 285





SEQ ID NO: 178 | 94 aa | MW at 9871.5kD


NOV42a, 
CG151822-01 
Protein Sequence 


MAGCAARAPPGSEARLSLATFLLGASVLALPLLTRAGLQGRTGI^ 

IRACFLGFVFGCGTLLSFSQSSWSHFG W


NOV42b,
CG151822-02 
DNA Sequence 



SEQ ID NO: 179 | 3597 bp



GGCACGAGCGGCGCCGCCGCCCGCTAGTCCGCCGCCCGGCGCCA TGGCGGGCTGCGCGGCGCGGnPT 



CCGCCGGGCTCTGAGGCGCGTCTCAGCCTCGCCACCTTCCTGCTGGGCGCCTCGGTGCTCGCGCTGC 
CGCTGCTCACGCGCGCCGGCCTGCAGGGCCGCACCGGGCTGGCGCTCTACGTGGCCGGGCTCAACGC 
GCTGCTGCTGCTGCTCTATCGGCCGCCTCGCTACCAGATAGCCATCCGAGCTTGTTTCCTGGGGTTT 
GTGTTCGGCTGCGGCACGCTGCTAAGTTTTAGCCAGTCTTCTTGGAGTCACTTTGGCTGGTACATGT 
GCTCCCTGTCATTGTTCCACTATTCTGAATACTTGGTGACAGCAGTCAATAATCCCAAAAGTCTGTC 
CTTGGATTCCTTTCTCCTGAATCACAGCCTGGAGTATACAGTAGCTGCTCTTTCTTCTTGGTTAGAG 
TTC ACAC TTGAAAATATC TTTTGGCC AGAACTGAAGCAGATTACC TGGC TC AGTGTC AC AGGGC TGC 
TGATGGTGGTCTTCGGAGAATGTCTGAGGAAGGCGGCCATGTTTACAGCTGGCTCCAATTTCAACCA 
CGTGGTACAGAATGAAAAATCAGATACACATACTCTGGTGACCAGTGGAGTGTACGCTTGGTTTCGG 
CATCCTTCTTACGTCGGGTGGTTTTACTGGAGTATTGGAACTCAGGTGATGCTGTGTAACCCCATCT 
GCGGCGTC AGC TATGCCCTGACAGTGTGGCGATTCT TCCGCGATCGAACAGAAGAAGAAGAAATC TC 
ACTAATTCACTTTTTTGGAGAGGAGTACCTGGAGTATAAG71AGAGGGTGCCCACGGGCCTGCCTTTC 
ATAAAGGGGGTCAAGGTGGACCTGTG ACGGGCAGTGGCCCCGGTGACCTTGGGGCCTCCGACCCTGT 



GCAGCCTGGGACAAAACTGTTTCCGGTTGGCCGCTGCCACATGGATTTTCTTAATCGTTTTATGTCA 



TTAGTC AC TCTTCTGGAATGTC AC TC AAGACC AAGC GGTC AGAAGGCCTG AGGACCC AAGGCCCC AC 



TGGAGCAGTCTGTCCTTATGCCGAATCAAGGCGGAACATGGGTGAAAGACGAGTAAGGGGCAAATCA 



CAGCAATATTCCACAGCGCCCTCCAGAGTTACCTGGGGAGGACCGAGGCCACACGCCACTGCCCCCG 



AGGCC AGAGTGTAAGTAAAGGATAACC AGGAC TCGC TGGG AG AGATGG AC TC TGTCC TCAGC AAC AC 



TCCACAGCAGAAAGGGGTAGCAGGTACCCCTTCTTATCAGCGGTAAAAATGCATTTACAACCTTTCA 



TTTAACCGAAAAACACAGACCGCTTTAACCTCTTTATTTCTGTCCCCCACTGCATGAACATCTATAC 



AATTTTAAAAATAC TTCCTC ATAGGATGC TTTGGCC CTTC ATC TATTTAATC ATAGC TAC ATACCTA 



TTTTTT TAT AAGTAGC AGTAC AC ATTC AAAGGGGTATTCC TAGCTC AATGC TTGGTGTTCTAGTTCA 



ACTTTTATCCTGCAGCAAGTAAGCCTAGATAACTCTACACGATTTGGCTGAGTGGCTTTGTGTGACC 



GTGGCCCCAGGCCAAGGGGACCATGGCCCTGGC TGGC TTTCCCCCGGGGGTCTC AGC TCCTGTTGTC 



AGTGATAGGCGGCTCAAAGGAGCATCAGTTTCTTTTGATCCAAGAAGTGCTTACTGAATGCCTGCCC 



TGTGCGTGGCCTTAAACATTGAGAAGTGCTGCTCTCCGTTTATTTGGGATTTGATTCTCATTTTACC 



ATAGCTTATATTCTCAATTTCAATGCCAGTCTCAGAACTCTTGTTTTCTGTGTTCTGTTCTCAAAAT 



TACATTGTCCCTCATGTCATTTCAAACTGTTTTCCAAAGGGATTTGAGCATATACAACTACAAATCC 



AAGCAGATTGACTCTCAAAAATAATCTTAAATACTGCAAATAGTCCCAACTAAGATTCAGTCAGTAT 



GTTTGTTTTGCAAGTTTGGGAGAGTAAGTTGGCTTTGAGTCACACATCGAAGCTTTAAGAGGTGAGA 



CGCTGGCTTCATTCTGGACTAGACAGGAACTTGGCCTCAGCGTGAGATCCTGCCATGCAGTGTTGCG 



GTGGCACTGAAGAAGTGTGAATGTGAAGGCGGCGTCGGCGCGGGGCCAGAGCACCACTCTGCTGCCC 



CACCACGCGGCCTGTGAGGAGCCACTAAACCTTTCCGTGCCTAGACCTCCCCATCTGTGGAATGGGG 



TCAATACCACCTACCTCACAGGGGTGTTGTGAGGACTGAGAAGAACAATGTCAAATGTTTTTAATAC 



TC AGATGTGGGAGCGAC ATC AATGAAATC TGT ACTGTATGAAAGC TAC AC AAAAATGGGCAG ACATT 



TGGTTAATTGTGCCAGATACCTAAAATGTATGTTCAGAAAAGCATTTTATCi\ACTCAGAAATATGAC 



TTATTTC TAGATTC ATGGC TTAATG AATTTTTTC ATTGTTATATATACCAAAGAGGC TTACGGGTTC 



AT TG ATTGGTTTGAAAACC AGAC AGACGGC CGGGC AC GCC TGTAATCCCAAAGTGC TGGGATTGC AG 



CGTGAGCCACCACGCCCAGCCAAGATGAACTCCTTAAGGACAGGATTTGGTAAGTGATTGACTTCTT 



TTTAGTTCCATGATCTTGAGATTATTTTTAGCTTTATAAATTTAGCAGTGGCAGGGCCCGTGGAGAA 



TCAGGTTAATGAGGTAAAGGCTTTCTGGGTATTTGCTGCCAAGGCCACATCACCAATTTTCTCGATT 



TAAAAAACTGTCAAGAGATTTATTTTTCCATTGCAGGTTTTAAAGTGGAGATTCTGAAGTGGAAAAT 



AGGTAC TGTC AGAACAAAGCTACC TGGAAAC AGC ATAGAGTGAAGCC TTTCGTGAGGGCTTGC AGGC 



CGCTGC TGAGTGGC AGTTTAC AG AAGAGGTCGCGGGGTGAGCCTC TTAGC AGGAC AGAAAACAAGGC 



AGCAGCGCACCTGCCACCCCTTCACGAGCTGCTCCTTGAGCCTAAAAAGTAGGCTTTATTCATCCCT 



TC TGTTC ATTTACC AAC C TGGGGGATTG ATACGACCGGGGAAAATGTTCCTAAAC CAGG AAGC TGCG 



TTAGCCGATCAGGCTTTGTAAGATCTCGCCAACAGCTAGCTGCTTAGGAGTACCCCCACGATACGCA 



CAGCACACCACTGTCCCTTCACTGCACTTTCTTCCTGCCTTAGGTAGTTGGGCTTGCCCACCCTAGT 



TTGCTTTTGTAGTGGTTTGGCAAGGTTAGAAGGCCTCGGCCTCTCTGTCATGCTGGGAAGTGCCTAC 



TCTCTGGGCCACTGCTGCAGAGGCCGTGGCACTTGTCATGGGTTTGGAAGACCCAGCCATCTGCAGC 



AGAGGCAGCCTATCCCATTGCAAGGAGAGGAACTGAACGGAGTAATTATTCTACTCTTCTTTTTACA 



TAAATGTTTATTTAAATATTCTAAATTGGATTTTCATTCACAGATACTGATTATTCTTTCCAGTTCT 



TAAATAAAAC TGCACTTG ATTTC ACTCAAAAAAAAAAAAAAAAAAA 



ORF Start: ATG at 44 | ORF Stop: TGA at 896





SEQ ID NO: 180 | 284 aa | MW at 31937.7kD


NOV42b, 
CG151822-02 
Protein Sequence 


MAGCAARAPPGSEARLSLATFLLGASVLALPLtiTRAGLQGRTGLALYVAGLNALLLLLYRPPRYQ^ 
IRACFLGFVPGCGTLLSFSQSSWSHFGWYMCSLSLFHYSEYLVTAVNNPKSLSLDSFLLNHSLEYTV 
AALSSWLEFTLENIFWPELKQITWLSVTGLLMV^ 

SGVYAWFRHPSYVGWFYWS IGTQVMLCNPICGVS YALTVWRFFRDRTEEEEI SLIHFFGEEYLEYKK 
RVPTGL PF IKGVKVDL


Sequence comparison of the above protein sequences yields the sequence
relationships shown in Table 42B.






Table 42B. Comparison of NOV42a against NOV42b.

Protein Sequence | NOV42a Residues/Match Residues | Identities/Similarities for the Matched Region
NOV42b | 1..94/1..94 | 67/94 (71%); 67/94 (71%)
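
By way of illustration only, the percentage reported beside each identity or similarity count can be recomputed from the ratio itself. The sketch below is not part of the original analysis; the function names are hypothetical, and the rounding convention used by the reporting software is not stated in this document, so the exact value is returned and the caller may truncate or round it.

```python
from fractions import Fraction


def parse_ratio(cell: str) -> Fraction:
    """Parse a 'matched/aligned' cell such as '67/94' into an exact fraction."""
    matched, aligned = (int(part) for part in cell.split("/"))
    if aligned <= 0 or not 0 <= matched <= aligned:
        raise ValueError(f"implausible ratio: {cell!r}")
    return Fraction(matched, aligned)


def percent(cell: str) -> float:
    """Return the ratio as a percentage, without rounding."""
    return float(parse_ratio(cell) * 100)


# Identity count listed for NOV42a against NOV42b in Table 42B.
identity_42b = percent("67/94")
print(f"{identity_42b:.2f}%")
```

For the 67/94 entry, truncation and rounding to a whole number give the same tabulated figure of 71%.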



Further analysis of the NOV42a protein yielded the following properties shown in 
Table 42C. 




Table 42C. Protein Sequence Properties NOV42a

PSort analysis: 0.6000 probability located in plasma membrane; 0.4000 probability located in Golgi body; 0.3174 probability located in mitochondrial intermembrane space; 0.3000 probability located in endoplasmic reticulum (membrane)
SignalP analysis: Cleavage site between residues 37 and 38
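
As an illustrative aid, a PSort result line of the kind shown in Table 42C can be parsed and ranked programmatically. The following is a minimal sketch that assumes the "value probability located in compartment" phrasing used in this document; the function name and regular expression are hypothetical and are not part of the PSort program. The listed values do not sum to one, so they are treated here simply as scores to be ranked.

```python
import re

# One entry looks like: "0.6000 probability located in plasma membrane"
PSORT_ENTRY = re.compile(r"(\d*\.\d+)\s+probability\s+located\s+in\s+([^;]+)")


def parse_psort(text: str) -> list[tuple[str, float]]:
    """Return (compartment, score) pairs sorted from highest to lowest score."""
    entries = [(loc.strip(), float(score)) for score, loc in PSORT_ENTRY.findall(text)]
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


# PSort analysis text for NOV42a as given in Table 42C.
nov42a = (
    "0.6000 probability located in plasma membrane; "
    "0.4000 probability located in Golgi body; "
    "0.3174 probability located in mitochondrial intermembrane space; "
    "0.3000 probability located in endoplasmic reticulum (membrane)"
)
ranked = parse_psort(nov42a)
best_location, best_score = ranked[0]
print(best_location, best_score)
```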



A search of the NOV42a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 42D.



Table 42D. Geneseq Results for NOV42a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV42a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
AAY32299 | Farnesyl-directed cysteine carboxymethyltransferase STE14 - Homo sapiens, 284 aa. [WO9955878-A1, 04-NOV-1999] | 1..94/1..94 | 94/94 (100%); 94/94 (100%) | 2e-48
AAW67730 | Human prenylcysteine carboxyl methyltransferase - Homo sapiens, 284 aa. [WO9856924-A1, 17-DEC-1998] | 1..94/1..94 | 94/94 (100%); 94/94 (100%) | 2e-48
AAB32052 | Human secreted protein BLAST search protein SEQ ID NO: 110 - Homo sapiens, 223 aa. [WO200058350-A1, 05-OCT-2000] | 12..94/1..83 | 83/83 (100%); 83/83 (100%) | 3e-41
AAB32051 | Human secreted protein BLAST search protein SEQ ID NO: 109 - Homo sapiens, 223 aa. [WO200058350-A1, 05-OCT-2000] | 12..94/1..83 | 83/83 (100%); 83/83 (100%) | 3e-41
AAY32300 | Mouse farnesyl-directed cysteine carboxymethyltransferase - Mus musculus, 153 aa. [WO9955878-A1, 04-NOV-1999] | [illegible]/4..93 | [illegible]; 83/90 (92%) | 2e-40


In a BLAST search of public sequence databases, the NOV42a protein was found to
have homology to the proteins shown in the BLASTP data in Table 42E.


Table 42E. Public BLASTP Results for NOV42a

Protein Accession Number | Protein/Organism/Length | NOV42a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
O60725 | Protein-S isoprenylcysteine O-methyltransferase (EC 2.1.1.100) (Isoprenylcysteine carboxylmethyltransferase) (Prenylcysteine carboxyl methyltransferase) (pcCMT) (Prenylated protein carboxyl methyltransferase) (PPMT) - Homo sapiens (Human), 284 aa. | 1..94/1..94 | 94/94 (100%); 94/94 (100%) | 5e-48
Q9SQK7 | Protein-S isoprenylcysteine O-methyltransferase (EC 2.1.1.100) (Isoprenylcysteine carboxylmethyltransferase) (Prenylcysteine carboxyl methyltransferase) (pcCMT) (Prenylated protein carboxyl methyltransferase) (PPMT) - Mus musculus (Mouse), 283 aa. | 5..94/4..93 | 84/90 (93%); 85/90 (94%) | 2e-41
O12947 | Protein-S isoprenylcysteine O-methyltransferase (EC 2.1.1.100) (Isoprenylcysteine carboxylmethyltransferase) (Prenylcysteine carboxyl methyltransferase) (pcCMT) (Prenylated protein carboxyl methyltransferase) (PPMT) (Farnesyl cysteine carboxyl methyltransferase) (FCMT) - Xenopus laevis (African clawed frog), 288 aa. | 13..94/9..98 | 49/90 (54%); 59/90 (65%) | 2e-19
Q9WVM4 | Protein-S isoprenylcysteine O-methyltransferase (EC 2.1.1.100) (Isoprenylcysteine carboxylmethyltransferase) (Prenylcysteine carboxyl methyltransferase) (pcCMT) (Prenylated protein carboxyl methyltransferase) (PPMT) (Farnesyl cysteine carboxyl methyltransferase) (FCMT) - Rattus norvegicus (Rat), 232 aa (fragment). | 53..94/1..42 | 39/42 (92%); 40/42 (94%) | 8e-17
Q9R1L8 | Farnesyl cysteine carboxyl methyltransferase - Rattus norvegicus (Rat), 33 aa (fragment). | 65..94/1..30 | 28/30 (93%); 29/30 (96%) | 4e-10



Example 43. 

The NOV43 clone was analyzed, and the nucleotide and encoded polypeptide
sequences are shown in Table 43A.



Table 43A. NOV43 Sequence Analysis 



SEQ ID NO: 181 | 2306 bp

NOV43a,
CG152256-01
DNA Sequence

GCCATGGCGTCCTGCGTGGGGAGCCGGACCCTAAGCAAGGATGATGTGAACTACAAAATGCATTTCC



GGATGATCAACGAGCAGCAAGTGC^GGACATCACCAT^toCtfTC 
CCTGCTCAGCTTCACCATCGTCAGCCTCATGTACTTCGCCTTTACCAGGGATGACTCTGTTCCAGAA 
GACAACATCTGGAGAGGCATCCTCTCTGTTATTTTCTTCTTTCTTATCATCAGTGTGTTAGCTTTCC 
CCAATGGTCCGTTCACTCGACCTCATCCAGCCTTATGGCGAATGGTTTTTGGACTCAGTGTGCTCTA 
CTTCCTGTTCCTGGTATTCCTACTCTTCCTGAATTTCGAGCAGGTTAAATCTCTAATGTATTGGCTA 
GATCCAAATCTTCGATACGCCACAAGGGAAGCAGATGTCATGGAGTATGCTGTGAACTGCCATGTGA 
TCACCTGGGAGAGGATTATCAGCCACTTTGATATTTTTGCATTTGGACATTTCTGGGGCTGGGCCAT 
GAAGGCCTTGCTGATCCGTAGTTACGGTCTCTGCTGGACAATCAGTATTACCTGGGAGCTGACTGAG 
CTCTTCTTCATGCATCTCCTCCCCAATTTTGCCGAGTGCTGGTGGGATCAAGTCATTCTGGACATCC 
TGTTGTGCAATGGCGGTGGCATTTGGCTGGGCATGGTCGTTTGCCGGTTTTTAGAGATGAGGACTTA 
CCACTGGGCAAGCTTCAAGGACATTCATACCACCACCGGGAAGATCAAGAGAGCTGTTCTGCAGTTC 
ACTCC TGC TAGC TGGAC C TATGTTCG ATGGTTTGACCCC AAATCTTC TT TTC AG AGAGTAGCTGGAG 
TGTAC CTT TTC ATGATC ATC TGGC AGCTGAC TGAGTTGAATACCTTC TTC TTGAAGCATATCTTTGT 
GTTCCAAGCCAGTCATCCATTAAGTTGGGGTAGAATTCTCTTTATTGGTGGCATCACAGCTCCCACA 
GTGAGACAGTACTACGCTTACCTCACCGACACACAGTGCAAGCGCGTAGGAACACAATGCTGGGTGT 
TTGGGGCTTTCACCACTTTCCTCTGTCTGTACGGCATGATTTGGTATGCAGAACACTATGGTCACCG 
AGAAAAGACCTACTCGGAGTGTGAAGATGGCACCTACAGTCCAGAGATCTCCTGGCATCACAGGAAA 
GGGAC AAAAGGTTCTGAAGAC AGCCCAC C C AAGCATGCAGGC AAC AACGAAAGCCATTC TTCC AGGA 
GAAGGAATCGGCATTCCAAGTCAAAAGTCACCAATGGCGTTGGAAAGAAATGAAAAACCCTGGTTAA 



TCAAAGATGTTCCAGAGTGCCTAGAACTGAGAGGGAAATGGAACTCATTTGGAACTCCCCGTGAGGA 



GGTCGAGGCGCACAGGGCAAGCAGGAAGAGGCGAGGGCACTTGGGGGTCATTATTTGAGATCGTAAG 



TCTTGTTTCCCACAGACCTGGCCGCGTCAGGCAGATCATCGCCTGGGGGGCCTTTGCCAACGTGGGG 
TCTCTTCTAACTTCAGCACTTGACATGCGGTCACCGGTGGCAGCGCGGTGTGTTGAAGGGAAACGGT 
AGCTATTCATTCACAGTTGCCAAGAGCAGCTCCGCGCCTGCTGGATCGTGGATGCAGCGTAAACATC 



TTCCTTCAGACGAGGCATTAACCCCATGGTTAATGGACTGGTCACCAGTTTTTATTTTATTTTTATG 
AATCTACCTTTCCATTGATTGATTTAAGTTCAGGCCACTTTTCTGTCTTTTATTTGGTTACTGTTGT 
TATTTGTT TTTAAGTTAGG ATGCTTTTTAAC AGCC TTTAGAAGC CGCTGCTG AAATTGATAC TGGGG 



GAAGGGTTCCCCTTCCTTCTAGAGCAGAAAAGGGAGAGAAGTGTTGTATTCCTGTTTGGTAACCTCA 
GTCTCCTGTAAGACC TCCTACC ACATGGCGAG TATAC AC C AATC AGGAGAGGGTAGCTGCC TGC ATA 
GGAGC CTCGC TTCCG ATTATTCCCTTCC C AATATTATTC ATCC AG ACTTAGCC ACAGTGCACAAAAG 



CAAACCTGCTAGAGAGGCAGTGAACACCACAGCTTCTCCCCAGCTTGGTGCCTTTTACATCGGGTTT 
GTTCTCCTTCCATGGTGTGTTGCTGACATTGTCACTGAGTCCCATGTGAGGTGCTGGTGAGTATTAC 
CTTTCATCTGTGCCATGCTCTAGAACCTTGACCTTGATAGTTCACCACGTCTGATGGATCCCTGTTT 
TAAATAAAAACGATTCACTTTAAAGCCT 



ORF Start: ATG at 4 | ORF Stop: TGA at 1324





SEQ ID NO: 182 | 440 aa | MW at 51772.5kD


NOV43a, 
CG152256-01 
Protein Sequence 


mascvgsrtlskddvnykmhfrmineqqveditidf^ 
niwrgilsvifffliisviafpngpftrphpalwriotglsv^^ 

pnlryatreadvme yavnchvt twer 1 1 shfdi f afghfwgwamkall irs yglcwt i s itwei/tel 

FFMHLIjPNFAECWWDQVILDILLCNGGGIWLG 

paswtyvrwfdpkssfqrvagvylfmiiwqltelntfflkhifvfqashplswgrilfiggitaptv 

rqyyayltdtqckrvgtqc^fgafttflciiygmivirym^ 

tkgsedsppkhagkneshssrrrnrhskskvtngvgkk 
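
For illustration, the ORF coordinates and protein lengths reported in the sequence analysis tables can be cross-checked arithmetically. The sketch below assumes, consistently with the tabulated values in this document, that the ORF start is the 1-based position of the first base of the ATG and that the ORF stop is the 1-based position of the first base of the stop codon. The function name is hypothetical.

```python
def orf_protein_length(start: int, stop: int) -> int:
    """Return the number of residues encoded by an uninterrupted ORF.

    start: 1-based position of the first base of the start codon (ATG).
    stop:  1-based position of the first base of the stop codon.
    """
    coding_bases = stop - start
    if coding_bases <= 0 or coding_bases % 3 != 0:
        raise ValueError("start and stop are not in the same reading frame")
    return coding_bases // 3


# ORF start/stop positions as listed in Tables 42A through 46A.
orfs = {
    "NOV42a": (3, 285),
    "NOV42b": (44, 896),
    "NOV43a": (4, 1324),
    "NOV44a": (421, 1075),
    "NOV45a": (75, 1326),
    "NOV46a": (20, 1268),
}
lengths = {name: orf_protein_length(*pos) for name, pos in orfs.items()}
for name, length in lengths.items():
    print(name, length, "aa")
```

Under these assumptions the computed length for NOV43a is 440 residues, which agrees with the 440 aa stated for SEQ ID NO: 182.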



Further analysis of the NOV43a protein yielded the following properties shown in 
Table 43B. 



Table 43B. Protein Sequence Properties NOV43a

PSort analysis: 0.6000 probability located in plasma membrane; 0.4000 probability located in Golgi body; 0.3000 probability located in endoplasmic reticulum (membrane); 0.0300 probability located in mitochondrial inner membrane
SignalP analysis: No Known Signal Sequence Predicted



A search of the NOV43a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 43C.



Table 43C. Geneseq Results for NOV43a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV43a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
ABB89640 | Human polypeptide SEQ ID NO 2016 - Homo sapiens, 473 aa. [WO200190304-A2, 29-NOV-2001] | 1..440/1..473 | 440/473 (93%); 440/473 (93%) | 0.0
AAB58945 | Breast and ovarian cancer associated antigen protein sequence SEQ ID 653 - Homo sapiens, 516 aa. [WO200055173-A1, 21-SEP-2000] | 1..440/44..516 | 439/473 (92%); 439/473 (92%) | 0.0
ABB71324 | Drosophila melanogaster polypeptide SEQ ID NO 40764 - Drosophila melanogaster, 498 aa. [WO200171042-A2, 27-SEP-2001] | 3..359/59..412 | 206/357 (57%); 276/357 (76%) | e-133
AAB73515 | Human transferase HTFS-22, SEQ ID NO:22 - Homo sapiens, 487 aa. [WO200132888-A2, 10-MAY-2001] | 22..361/45..389 | 128/351 (36%); 185/351 (52%) | 2e-60
AAM79907 | Human protein SEQ ID NO 3553 - Homo sapiens, 529 aa. [WO200157190-A2, 09-AUG-2001] | 22..361/63..407 | 128/351 (36%); 185/351 (52%) | 2e-60



In a BLAST search of public sequence databases, the NOV43a protein was found to
have homology to the proteins shown in the BLASTP data in Table 43D.



Table 43D. Public BLASTP Results for NOV43a

Protein Accession Number | Protein/Organism/Length | NOV43a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
P48651 | Phosphatidylserine synthase I (Serine-exchange enzyme I) (EC 2.7.8.-) - Homo sapiens (Human), 473 aa. | 1..440/1..473 | 440/473 (93%); 440/473 (93%) | 0.0
Q99LH2 | Similar to phosphatidylserine synthase 1 - Mus musculus (Mouse), 473 aa. | 1..440/1..473 | 428/473 (90%); 437/473 (91%) | 0.0
Q00576 | Phosphatidylserine synthase I (Serine-exchange enzyme I) (EC 2.7.8.-) - Cricetulus longicaudatus (Long-tailed hamster) (Chinese hamster), 471 aa. | 1..440/1..471 | 428/473 (90%); 434/473 (91%) | 0.0
[illegible] | Phosphatidylserine synthase-1 - Mus musculus (Mouse), 473 aa. | 1..440/1..473 | 421/473 (89%); 432/473 (91%) | 0.0
Q9BSY0 | Similar to phosphatidylserine synthase 1 - Homo sapiens (Human), 334 aa (fragment). | 145..440/6..334 | 292/329 (88%); 293/329 (88%) | e-178



PFam analysis predicts that the NOV43a protein contains the domains shown in
Table 43E.



Table 43E. Domain Analysis of NOV43a

Pfam Domain | NOV43a Match Region | Identities/Similarities for the Matched Region | Expect Value
COLFI | 119..137 | 10/19 (53%); 14/19 (74%) | 0.12
PSS | 96..370 | 179/310 (58%); 267/310 (86%) | 1.1e-206



Example 44. 

The NOV44 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 44A.



Table 44A. NOV44 Sequence Analysis



SEQ ID NO: 183 | 1151 bp


NOV44a, 
CG171804-01 


CNTGNATTTGGCCGK5GGGGCCATGTAGCTCCGAGCGGCGGATCGCGAGCCTCCTCCGAACrrrAf^ 
TGCACGCCCGGTTAGC ATTCGGCCGGGAGATGCGGC AGTGGAATC TGGAAGGGCGGTGAAAAArr T a 
CGTCC TGC CC TCGCC CGGC C TC TCCATTCGTCCCCCGGGTAGAGAGGGTCGGCTCGTGCTf" ATf A Tr 1 


DNA Sequence 

— 


CTGTGCTCCGTGGTCTTCTCTGCCGTCTACATCCTCCTGTGCTGCTGGGCCGGCCTGTCCCTCTGrr 
TGGCCACC TGCCTGGACCACC AC TTC CCCACAGGCTCC AGGCCCACTGTGPrGGG a r PPPwy a nnwn 

CAGTGGATATAGCAGTGTGCCAGATGGGAAGCCGCTGGTCCGCGAGCCCTGCCGCAGCTGTGCCGTG 


GTGTCCAGCTCCGGCCAAATGCTGGGOTnAGGrrTnraTO^ 

TCCGCATGAACCAGGCGCCCACCGTGGGCTTTGAGGCGGATGTGGGCCAGCGCAGCACCCTGCGTGT 
CGTCTCACACACAAGCGTGCCGCTGCTGCTGCGCAACTATTCACACTACTTCCAGAAGGCCCGAGAC 
ACGCTCTACATGGTGTGGGGCCAGGGCAGGCACATGGACCGGGTGCTCGGCGGCCGCACCTACCGCA 
CGC TGCTGCAGC TCACCAGGATGTACCCCGGCC TGCAGGTGTAC AC C TTC ACGGAGCGCATGATGGC 
CTACTGCGACCAGATCTTCCAGGACGAGACGGGCAAGAACCGGAGGCAGTCGGGCTCCTTCCTCAGC 
ACCGGCTGGT TCACCATGATCCTCGCGC TGGAGCTGTGTGAGGAGATCGTGGTC TATGGGATGGTCA 
GCGACAGCTACTGCAGGGAGAAGAGCCACCCCTCAGTGCCTTACCACTACTTTGAGAAGGGCCGGCT 
AGATGAGTGTCAGATGTACCTGGCACACGAGCAGGCGCCCCGAAGCGCCCACCGCTTCATCACTGAG 
AAGGCGGTC TTC TCCCGCTGGGCCAAGAAGAGGCC CATCGTGTTCGC CCATCCGTCC TGGAGGACTG 
AGTAGCTTCCGTCGTCCTGCCAGCCGCCATGCCGTTGCGAGGCCTCCGGGATGTCCCATrrrAAGrr 


ATCACACTCCAC 


ORF Start: ATG at 421 | ORF Stop: TAG at 1075



SEQ ID NO: 184 | 218 aa | MW at 25333.8kD



NOV44a,
CG171804-01
Protein Sequence

mlg s glg ae i d s aec vfrmnqap tvg feadvg qrs tlr wshts vpl llrnys hytqkardtl ymvw
gqgrhmdrvlggrtyrtllqltrmypglqvytftermmaycijqifqdetgkni^
ILALELCEEIVWGMVSDSYCREKSHPSVPYHYFEKGRLDECQMYLAJIEQ
wakkrpivfahpswrte



Further analysis of the NOV44a protein yielded the following properties shown in
Table 44B.



Table 44B. Protein Sequence Properties NOV44a

PSort analysis: 0.6400 probability located in microbody (peroxisome); 0.3000 probability located in nucleus; 0.2068 probability located in lysosome (lumen); 0.1000 probability located in mitochondrial matrix space
SignalP analysis: No Known Signal Sequence Predicted




A search of the NOV44a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 44C.



Table 44C. Geneseq Results for NOV44a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV44a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
AAB75350 | Human secreted protein #9 - Homo sapiens, 302 aa. [WO200100806-A2, 04-JAN-2001] | 1..218/85..302 | 218/218 (100%); 218/218 (100%) | e-128
AAB61614 | Human protein HP03380 - Homo sapiens, 302 aa. [WO200102563-A2, 11-JAN-2001] | 1..218/85..302 | 218/218 (100%); 218/218 (100%) | e-128
AAJdZj 7o4 | Human secreted protein SEQ ID #76 - Homo sapiens, 302 aa. [WO200037491-A2, 29-JUN-2000] | 1..218/85..302 | 218/218 (100%); 218/218 (100%) | e-128
AAB28674 | Human carbohydrate-modifying enzyme [illegible] 983984CD1 - Homo sapiens, 302 aa. [WO200063351-A2, 26-OCT-2000] | 1..218/85..302 | 218/218 (100%); 218/218 (100%) | e-128
AAB24495 | Human secreted protein sequence encoded by gene 5 SEQ ID NO: 120 - Homo sapiens, 345 aa. [WO200035937-A1, 22-JUN-2000] | 1..218/128..345 | 217/218 (99%); 217/218 (99%) | e-128
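
For illustration, the Expect values in these tables appear in several textual forms, including the shorthand "e-128" with no leading mantissa. The sketch below converts such cells to numbers so that hits can be compared or sorted. It assumes that "e-128" is to be read as 1e-128; that reading, and the function name, are assumptions and are not stated in this document.

```python
def parse_expect(cell: str) -> float:
    """Convert a tabulated Expect value to a float.

    Accepts ordinary forms such as '2e-48' and '0.0', and the shorthand
    'e-128', which is interpreted here (by assumption) as 1e-128.
    """
    text = cell.strip().lower()
    if text.startswith("e"):
        text = "1" + text
    return float(text)


# Example cells taken from the Geneseq and BLASTP tables in this document.
cells = ["e-128", "2e-48", "0.0", "7e-91"]
values = [parse_expect(cell) for cell in cells]
ranked = sorted(cells, key=parse_expect)  # most significant hit first
print(ranked)
```

A lower Expect value indicates a match less likely to arise by chance, so sorting in ascending order places the strongest hits first.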



In a BLAST search of public sequence databases, the NOV44a protein was found to
have homology to the proteins shown in the BLASTP data in Table 44D.



Table 44D. Public BLASTP Results for NOV44a

Protein Accession Number | Protein/Organism/Length | NOV44a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
Q9H4F1 | Alpha-N-acetyl-neuraminyl-2,3-beta-galactosyl-1,3-N-acetylgalactosaminide alpha-2,6-sialyltransferase (EC 2.4.99.7) (NeuAc-alpha-2,3-Gal-beta-1,3-GalNAc-alpha-2,6-sialyltransferase) (ST6GalNAc IV) (Sialyltransferase 7D) - Homo sapiens (Human), 302 aa. | 1..218/85..302 | 218/218 (100%); 218/218 (100%) | e-128
Q9H4F1 | Alpha2,6-sialyltransferase - Homo sapiens (Human), 302 aa. | 1..218/85..302 | 217/218 (99%); 218/218 (99%) | e-128
Q9NWU6 | CDNA FLJ20593 fis, clone KAT08984 - Homo sapiens (Human), 302 aa. | 1..218/85..302 | 217/218 (99%); 217/218 (99%) | e-127
Q9UKU1 | NeuAc-alpha-2,3-Gal-beta-1,3-GalNAc-alpha-2,6-sialyltransferase alpha2,6-sialyltransferase - Homo sapiens (Human), 302 aa. | 1..218/85..302 | 216/218 (99%); 216/218 (99%) | e-127
Q9R2B6 | Alpha-N-acetyl-neuraminyl-2,3-beta-galactosyl-1,3-N-acetylgalactosaminide alpha-2,6-sialyltransferase (EC 2.4.99.7) (NeuAc-alpha-2,3-Gal-beta-1,3-GalNAc-alpha-2,6-sialyltransferase) (ST6GalNAc IV) (Sialyltransferase 7D) - Mus musculus (Mouse), 360 aa. | 1..218/143..360 | 202/218 (92%); 207/218 (94%) | e-118
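
The residue ranges in Table 44D can be interpreted with a short calculation. The sketch below, offered for illustration only, takes the first row (NOV44a residues 1..218 matched to residues 85..302 of a 302 aa protein) and computes the positional offset and the fraction of each sequence covered by the match. The class and function names are hypothetical, and the ranges are assumed to be 1-based and inclusive.

```python
from typing import NamedTuple


class Range(NamedTuple):
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def parse_range(cell: str) -> Range:
    """Parse a 'start..end' cell such as '85..302'."""
    start, end = cell.split("..")
    return Range(int(start), int(end))


def coverage(aligned: Range, full_length: int) -> float:
    """Fraction of a full-length sequence spanned by the aligned range."""
    return aligned.length / full_length


query = parse_range("1..218")     # NOV44a residues
subject = parse_range("85..302")  # residues of the 302 aa database protein
offset = subject.start - query.start
query_cov = coverage(query, 218)
subject_cov = coverage(subject, 302)
print(offset, query_cov, round(subject_cov, 3))
```

On these figures the whole of NOV44a is matched, and the match lies in the C-terminal portion of the longer database protein, beginning 84 residues downstream of its first residue.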



PFam analysis predicts that the NOV44a protein contains the domains shown in
Table 44E.



Table 44E. Domain Analysis of NOV44a

Pfam Domain | NOV44a Match Region | Identities/Similarities for the Matched Region | Expect Value
Glyco_transf_29 | 1..202 | 65/324 (20%); 184/324 (57%) | 6e-43



Example 45. 

The NOV45 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 45A. 



Table 45A. NOV45 Sequence Analysis


SEQ ID NO: 185 | 1475 bp


NOV45a, 
CG171841-01 
DNA Sequence 


AGGACTCCAAGCGCCATGGCCGCTGCCGCCCGAGeCCGGGTCGCGTACTTGCTGAGGCAACTGCAAC 


GCGCAGCATGGCTGTTTCAAATATTAGATATGGAGCAGCAGTTACAAAGGAAGTAGGAATGGCAGAC 
CTAAAAAACATGGGTGCTAAAAATGTGTGCTTGATGACAGACAAGAACCTCTCCAAGCTCCCTCCTG 
TGCAAGTAGCTATGGATTCCCTAGTGAAGAATGGCATCCCCTTTACGGTTTATGATAATGTGAGAGT 
GGAACCAACGGATAGC T TC ATGGAAGCTATTGAGTTTGCCC AAAAGGGAGC TTTTGATGCC TATGTT 
GCTGTCGGTGGTGGCTCTACCATGGACACCTGTAAGGCTGCTAATCTGTATGCATCCAGCCCTCATT 
CTGATTTCCTAGATTATGTCAGTGCCCCCATTGGCAAGGGAAAGCCTGTGTCTGTGCCTCTTAAGCC 
TCTGATTGCAGTGCCAACTACCTCAGGAACCGGGAGTGAAACTACTGGGGTTGCCATTTTTGACTAT 
GAACACTTGAAAGTAAAAATTGGCATCACTTCGAGAGCCATCAAACCCACACTGGGACTGATTGATC 
CTCTGCACACCCTCCACATGCCTGCCCGAGTGGTCGCCAACAGTGGCTTTGATGTGTTTAGCCATGC 
CC TGGAGTC ATACAC C ACCC TGCCCTACC ACCTGCGGAGCC CCTGCCC TTC AAATCC CATC ACACGG 
CCTGCGTACC AGGGC AGCAACCCAATC AGTG AC ATTTGGGC TATCC ACGCGCTGCGGATCGTGGC TA 
AGTATCTGAAGGCTGTCAGAAATCCCGATGATCTTGAAGCAAGGTCTCATATGCACTTGGCAAGTGC 
TTTTGCTGGCATCGGCTTTGGAAATGCTGGTGTTCATCTGCATGGAATGTCTTACCCAATTTCAGGT 
TTAGTGAAGATGTATAAAGCAAAGGATTACAATGTGGATCACCCACTGGTGCCCCATGGCCTTTCTG 
TGGTGCTCACGTCCCCAGCGGTGTTCACTTTCACGGCCCAGATGTTTCCAGAGCGACACCTGGAGA^ 
GGCAGAACTTCTAGGAGCCGACACCCGCACTGCCAGGATCCAAGATGCAGGGCTGGTGTTGGCAGAC 
ACGC TCCGGAAATTC TTATTCGATCTGGATGTTGATGATGGCCTAGCAGCTGTTGGT TACTCCAAAG 
CTGATATCCCCGCACTAGTGAAAGGAACGCTGCCCCAGGAAAGGGTCACCAAGCTTGCACCCTGTCC 
CCAGTC AGAAGAGGATCTGGC TGCTC TGTTTGAAGCTTCAATGAAACTGTATTAATTGTCATTTTAA 
C TGAAAGAATTAC CGCTGGCCATTGTAGTGCTGAGAGC AAGAGC TGATCTAGC TAGGGCTTTGTCTT 




TTC ATC TTTGCGC AT AACTTACC TGTTACC AGTATAGGTGGGATATAC ATTTATCTTGCAGGAAATT 




C 




ORF Start: ATG at 75 | ORF Stop: TAA at 1326



SEQ ID NO: 186 | 417 aa | MW at 44871.2kD


NOV45a, 
CG171841-01 
Protein Sequence 


MAVSNIRYGAAVTKWGMADLKNMGAKNVC LMTDKNLSKLP PVQVAMDSIiVKNG I PFTVYDNVRVE P 
TDSFMEAIEFAQKG AFDAYVAVGGGS TMDTCKAANL YAS S PHSDFLDYVSAP IGKGKPVS VPLKPL I 
AVPTTSGTGSETTGVAIFDY^HLKVKIGITSRAIKPTLGIiIDPLHTLHMPARWANSGFDVFSHALE 
SYTTLPYHLRSPCPSNPITRPAYQGSWPISDIWAIHALRIVAKYLKAVRNPDDLEARSHMHLASAFA 
GIGFGNAGVHLHGMSYPISGLVKMY1CAKDYNVDHPLVPHGLSVVLTS 

LLGADTRTARI QDAGLVLADTLRKFLFDLDVDDGLAAVGYSKAD I PALVKGTLPQERVTKLAPCPQS 
EEDLAALFEASMKLY 



Further analysis of the NOV45a protein yielded the following properties shown in 
Table 45B. 



Table 45B. Protein Sequence Properties NOV45a

PSort analysis: 0.4500 probability located in cytoplasm; 0.3188 probability located in microbody (peroxisome); 0.2355 probability located in lysosome (lumen); 0.1000 probability located in mitochondrial matrix space
SignalP analysis: No Known Signal Sequence Predicted



A search of the NOV45a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 45C.



Table 45C. Geneseq Results for NOV45a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV45a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
[illegible] | Human dehydrogenase DHDR-6 protein - Homo sapiens, 467 aa. [WO200216562-A2, 28-FEB-2002] | 1..417/49..467 | 413/420 (98%); 414/420 (98%) | 0.0
AAB73686 | Human oxidoreductase protein UKr-ly - Homo sapiens, 467 aa. [WO200144448-A2, 21-JUN-2001] | 1..417/49..467 | 412/420 (98%); 413/420 (98%) | 0.0
ABB59876 | Drosophila melanogaster polypeptide SEQ ID NO 6420 - Drosophila melanogaster, 464 aa. [WO200171042-A2, 27-SEP-2001] | 1..417/46..464 | 254/420 (60%); 327/420 (77%) | e-146
[illegible] | Novel human diagnostic protein #8084 - Homo sapiens, 268 aa. [WO200175067-A2, 11-OCT-2001] | 1..268 | 243/268 (90%) | [illegible]
AAB42855 | Human ORFX ORF2619 polypeptide sequence SEQ ID NO:5238 - Homo sapiens, 212 aa. [WO200058473-A2, 05-OCT-2000] | 247..417/41..212 | 168/172 (97%); 170/172 (98%) | 7e-91



In a BLAST search of public sequence databases, the NOV45a protein was found to
have homology to the proteins shown in the BLASTP data in Table 45D.



Table 45D. Public BLASTP Results for NOV45a

Protein Accession Number | Protein/Organism/Length | NOV45a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
CAD28993 | Sequence 4 from Patent WO0216562 - Homo sapiens (Human), 467 aa. | 1..417/49..467 | 413/420 (98%); 414/420 (98%) | 0.0
Q96MF9 | CDNA FLJ32430 fis, clone SKMUS2001129, weakly similar to NAD-dependent methanol dehydrogenase (EC 1.1.1.244) - Homo sapiens (Human), 419 aa. | 1..417/1..419 | 412/420 (98%); 413/420 (98%) | 0.0
Q8R0N6 | Hypothetical 45.0 kDa protein - Mus musculus (Mouse), 419 aa. | 1..417/1..419 | 372/420 (88%); 394/420 (93%) | 0.0
Ijy WZDD | [illegible] protein - Drosophila melanogaster (Fruit fly), 464 aa. | 1..417/46..464 | 254/420 (60%); 327/420 (77%) | e-145
Q95S86 | GM05887p - Drosophila melanogaster (Fruit fly), 425 aa. | 1..417/7..425 | 254/420 (60%); 327/420 (77%) | e-145



PFam analysis predicts that the NOV45a protein contains the domains shown in
Table 45E.



Table 45E. Domain Analysis of NOV45a

Pfam Domain | NOV45a Match Region | Identities/Similarities for the Matched Region | Expect Value
Fe-ADH | 4..205 | 68/216 (31%); 143/216 (66%) | 5.6e-28
Fe-ADH | 228..288 | 30/68 (44%); 51/68 (75%) | 2.5e-10



Example 46. 

The NOV46 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 46A.



Table 46A. NOV46 Sequence Analysis



SEQ ID NO: 187 | 1310 bp


NOV46a, 
CG173017-01 
DNA Sequence 


CTACTCTCAGCCAGGAATCATGTCTTG(MCCGCTCGCCCGCCCTTCCTrnrTrAr:rr:r:rATOrrr:P2\ 
GGGCAGTGTGGGCCGGTGGGGGTGCGAAAAGAAATGCATTGTGGGGTCGCGTCCCGGTGGCGGCGGC 
GACGGCCCTGGCTGC^TCCCGCAGCGGCGGCGGCGGCC^CGGTGGCAGGCGGAGAACAACAAACCCC 
GGAGCCGGAGCCAGGGGAGGC TGGACGGG ACGGGATGGGCGACAGCGGGC G GGGTGGC CC TGGGGCT 
GGCAAAC GGC T ATG TGCAATC TGC GGGG AC AG AAG C TC AGG C AAACAC TAC GGGG TT T ACAG CTG TG 
AGGGTTGCAAGGGCTTCTTCAAACGCACCATCCGCAAAGACCTTACATACTCTTGCCGGGACAACAA 
AGACTGCACAGTGGACAAGCGCCAGCGGAACCGCTGTCAGTACTGCCGCTATCAGAAGTGCCTGGCC 
ACTGGCATGAAGAGGGAGGCGGTACAGGAGGAGCGTCAGCGGGGAAAGGACAGGGATGGGGATGGGG 
AGGGGGCTGGGGGAGCCCCCGAGGAGATGCCTGTGGACAGGATCCTGGAGGCAGAGCTTGCTGTGGA 
ACAGAAGAGTGACCAGGGCGTTGAGGGTCCTGGGGGAACCGGGGGTAGCGGCAGCAGCCCAAATGAC 
CCTGTGACTAACATCTGTCAGGCAGCTGACAAACAGCTATTCACGCTTGTTGAGTGGGCGAAGAGGA 
TCCCACACTTTTCCTCCTTGCCTCTGGATGATCAGGTCATATTGCTGCGGGCAGGCTGGAATGAACT 
CCTCATTGCCTCCTTTTCACACCGATCCATTGATGTTCGAGATGGCATCCTCCTTGCCACAGGTCTT 
CACGTGCACCGCAACTCAGCCCATTCAGCAGGAGTAGGAGCCATCTTTGATCGGGTGCTGACAGAGC 
TAGTGTCCAAAATGCGTGACATGAGGATGGACAAGACAGAGCTTGGCTGCCTGAGGGCAATCATTCT 
GTTTAATCCAGATGCCAApGGCCTCTCCAACCCTAGTGAGGTGGAGGTCCTGCGGGAGAAAGTGTAT 
GCATCACTGGAGACCTACTGCAAACAGAAGTACCCTGAGCAGCAGGGACGGTTTGCCAAGCTGCTGC 
TACGTCTTCCTGCCCTCCGGTCCATTGGCCTTAAGTGTCTAGAGCATCTGTTTTTCTTCAAGCTCAT 
TGGTGACACCCCCATCGACACCTTCCTCATGGAGATGCTTGAGGCTCCCCATCAACTGGCCTGAGCT 
CAGACCCAGACGTGGTGCTTCTCCACACTGGAGGAGC 


ORF Start: ATG at 20 | ORF Stop: TGA at 1268








SEQ ID NO: 188 | 416 aa | MW at 45778 JkD


NOV46a, 
CG173017-01 
Protein Sequence 


MSWAARPPFLPQRHAAGQCGPVGVRKEMHCGVASRWRRRRPWLDPAAAAAAAVAGGEQQTPEPEPGE 
AGRDGMGDSGRGG PGAGKRLCAI CGDRS SGKHYGVYSCEGCKGFFKRT IRKDLT YSCRDNKDCTVDK 
RQRTOCQYCRYQKCLATGMKREAVQEERQRGKDRDGDGEGAGGAPEEMPVDRILEAELAVEQKSDQG 
VEGPGGTGGSGSSPNDPVTNICQAADKQLFTLVEW^^ 

HRS IDVRDGILLATGLHVHRNSAHSAGVGAIFDRVLTELVSKMRDMRMDKTELGCLRAI ILFNPDAK 
GLSNP S EVEVLREKVYASLETYCKQKYPEQQGRFAKLLLRLPALRS IGLKCLEHLFFFKL IGDT P ID 
TFLMEMLEAPHQLA 



Further analysis of the NOV46a protein yielded the following properties shown in
Table 46B.



Table 46B. Protein Sequence Properties NOV46a

PSort analysis: 0.9700 probability located in nucleus; 0.3000 probability located in microbody (peroxisome); 0.1000 probability located in mitochondrial matrix space; 0.1000 probability located in lysosome (lumen)
SignalP analysis: No Known Signal Sequence Predicted




A search of the NOV46a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 46C.



Table 46C. Geneseq Results for NOV46a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV46a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
AAU78297 | Human Retinoid X Receptor beta (RXRbeta) protein - Homo sapiens, 533 aa. [WO200218420-A2, 07-MAR-2002] | 41..416/156..533 | 346/378 (91%); 352/378 (92%) | 0.0
AAR72483 | Human H-2RIIBP - Homo sapiens, 533 aa. [US5403925-A, 04-APR-1995] | 41..416/156..533 | 346/378 (91%); 352/378 (92%) | 0.0
AAR39468 | hRXR-beta1 - Homo sapiens, 533 aa. [WO9315216-A, 05-AUG-1993] | 41..416/156..533 | 346/378 (91%); 352/378 (92%) | 0.0
AAR39469 | hRXR-beta2 - Homo sapiens, 510 aa. [WO9315216-A, 05-AUG-1993] | 41..416/133..510 | 345/378 (91%); 351/378 (92%) | 0.0
AAY21625 | Ligand binding domain of nuclear receptor hRXRbeta - Homo sapiens, 525 aa. [WO9926966-A2, 03-JUN-1999] | 41..416/148..525 | 345/378 (91%); 351/378 (92%) | 0.0



In a BLAST search of public sequence databases, the NOV46a protein was found to
have homology to the proteins shown in the BLASTP data in Table 46D.



Table 46D. Public BLASTP Results for NOV46a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV46a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


S37781 


retinoid X receptor beta - 
human, 533 aa. 


41..416 
156..533 


346/378 (91%) 
352/378 (92%) 


0.0 


Q95L53 


Retinoid X receptor beta - 
Mustela vison (American
mink), 525 aa (fragment). 


41..416 
148..525 


346/378 (91%) 
352/378 (92%) 


0.0 
P28702


Retinoic acid receptor 
RXR-beta - Homo sapiens 
(Human), 533 aa. 


41..416
156..533


346/378 (91%)
352/378 (92%)


0.0


A41651 


retinoic acid receptor 
coregulator - rat, 451 aa. 


41..416 
74..451 


341/378 (90%) 
349/378 (92%) 


0.0 


D41727 


retinoid X receptor beta - 
mouse, 448 aa. 


41..416
71..448 


341/378 (90%) 
349/378 (92%) 


0.0 



PFam analysis predicts that the NOV46a protein contains the domains shown in the 
Table 46E. 



Table 46E. Domain Analysis of NOV46a 


Pfam Domain 


NOV46a Match Region 


Identities/ 
Similarities 
for the Matched 
Region 


Expect Value 


zf-C4 


86..161 


49/77 (64%) 
73/77(95%) 


1.5e-54 


hormone_rec


227..409 


74/207 (36%) 
157/207(76%) 


3.3e-68 



Example 47. 

The NOV47 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 47A. 



Table 47A. NOV47 Sequence Analysis 




SEQ ID NO: 189 | 1229 bp


NOV47a, 
CG173347-01 
DNA Sequence 


CCGAGACCATGGGGAAGCTCGTGGCGCTGGTCCTGCTGGGGGTCGGCCTGTCCTTAGTCGGGGAGAT 
GTTTCTGGCGTTTAGAGAAAGGGTGAATGCCTCTCGAGAAGTGGAGCCAGTAGAACCTGAAAACTGC 
CACCTTATTGAGGAACTTGAAAGTGGCTCTGAAGATATTGATATACTTCCTAGTGGGCTGGCTTTTA 
TCTCCAGTCTGCAGGTCTGTTGGAGTTTGCTGGAAGTCCACTCCAGACCCTGTTTGCCTGGGTATCA 
CCAGTGGAGGCTGGAGAACGGGAAATATTGCTGCCTGACT 

GGCATCCGCCTGTATGAGGGATTAAAATATCCAGGCATGCCAAACTTTGCGCCAGATGAACCAGGAA 
AAATCTTCTTGATGGATCTGAATGAACAAAACCCAAGGGCACAAGCACTAGAAATCAGTGGTGGATT 
TGACAAAGAATTATTTAATCCACATGGGATCAGTATTTTCATCGACAAAGACAATACTGTGTATCTT 
TATGTTGTGAATCATCCCCACATGAAGTCCACTGTGGAGATATTTAAATTTGAGGAACAACAACGTT 
CTCTGGTATACCTGAAAACTATAAAACATGAACTTCTCAAAAGTGTGAATGACATTGTGGTTCTTGG 
ACCAGAACAGTTCTATGCCACCAGAC^CCACTATTTTACCi^CTCCCTCCTGTCATTTTTTGAGATG 
ATC TTGGATC TTCGCTGGAC TTATGTTC TTTTCTAC AGCCC AAGGGAGGTTAAAGTGGTGGCCAAAG 
GATTTTGTAGTGCCAATGGGATCACAGTCTCAGCAGACCAGAAGTATGTCTATGTAGCTGATGTAGC 
AGCTAAGAACATTCACATAATGGAAAAACATGATAACTGGGATTTAACTCAACTGAAGGTGATACAG 
TTGGGCACCTTAGTGGATAACCTGACTGTCGATCCTGCCACAGGAGACATTTTGGCAGGATGCCATC 
CTAATCCTATGAAGCTAC TGAAC TATAACCCTGAGGACCC TCC AGGATCAGAAGTACTTCGCATCCA 
GAATGTTTTGTCTGAGAAGCCCAGGGTGAGCACCGTGTATGCCAACAATGGCTCTGTGCTTCAGGGC 
ACCTCTGTGGCTTCTGTGTACCATGGGAAAATTCTCATAGGCACCGTATTTCNCAAAACTCTGTACT 


CiTGAGCTCTAGACTC TAGATAGT ^ U 1 U vWI > . 




ORF Start: ATG at 9 | ORF Stop: TAG at 1215






SEQ ID NO: 190 | 402 aa | MW at 45160.5kD


NOV47a, 
CG173347-01 
Protein Sequence 


MGKLVALVLLGVGLSLVGE^LAFRERVl^REVEPVEPENCHLIEELESGSEDIDILPSGLAFISS 
LQVCWSLLEVHSRPCLPGYHQWRLQNGKYCCLIFLLEASSQRGIRLYEGLKYPGMPNFAPDEPGKIF 
LMDLNEQNPRAQALEI SGGFDKELFNPHGIS IFIDKI)NTVYLYVVNHPHMKSTVEIFKFEEQQRSLV 
YLKTBUffiLLKSVNDrVVLGPEQFYATRDH^ 

SANGIWSADQKYVYVADVAAKNIHIMEKHDNWDLTQLKVIQLGTL 
MKLLNYNPEX3PPGSEVLRIQNVLSEKPRVSTWANNGSVLQGTSVASVYHGKILIGTW 
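The ORF coordinates and protein length reported in Table 47A can be cross-checked with simple arithmetic. The sketch below assumes that both coordinates are 1-based and point at the first base of the start codon and of the stop codon respectively; that convention is an assumption that happens to reproduce the reported lengths, not something the text states, and the function name is hypothetical.

```python
def orf_protein_length(orf_start: int, orf_stop: int) -> int:
    """Return the number of encoded residues.

    Assumes 1-based coordinates pointing at the first base of the
    start codon and the first base of the stop codon.
    """
    coding_bases = orf_stop - orf_start
    if coding_bases <= 0 or coding_bases % 3 != 0:
        raise ValueError("ORF coordinates do not span a whole number of codons")
    return coding_bases // 3


# NOV47a: ORF Start ATG at 9, ORF Stop TAG at 1215
print(orf_protein_length(9, 1215))  # 402
```

Under the same assumption, the result agrees with the 402 aa length listed for SEQ ID NO: 190.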



Further analysis of the NOV47a protein yielded the following properties shown in 
Table 47B. 




Table 47B. Protein Sequence Properties NOV47a 


PSort analysis: 


0.8200 probability located in outside; 0.1900 probability located in lysosome 
(lumen); 0.1000 probability located in endoplasmic reticulum (membrane); 
0.1000 probability located in endoplasmic reticulum (lumen) 


SignalP analysis: 


Cleavage site between residues 31 and 32 
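A predicted cleavage site "between residues 31 and 32" splits a precursor into a signal peptide (residues 1 to 31) and a mature chain (residue 32 onward). The following is a minimal sketch of that split; the function name is hypothetical and the example sequence is a placeholder, not the NOV47a sequence.

```python
from typing import Tuple


def split_signal_peptide(protein: str, last_signal_residue: int) -> Tuple[str, str]:
    """Split a precursor at a cleavage site given as the 1-based index of
    the last signal-peptide residue (31 for a site between 31 and 32)."""
    if not 0 < last_signal_residue < len(protein):
        raise ValueError("cleavage site must fall inside the sequence")
    return protein[:last_signal_residue], protein[last_signal_residue:]


# Hypothetical 40-residue placeholder, used only to show the indexing.
toy = "M" + "A" * 30 + "G" * 9
signal, mature = split_signal_peptide(toy, 31)
print(len(signal), len(mature))  # 31 9
```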



A search of the NOV47a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 47C.



Table 47C. Geneseq Results for NOV47a


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date]


NOV47a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


ABB97287 


Novel human protein SEQ ID 
NO: 555 - Homo sapiens, 354 
aa. [WO200222660-A2, 
21-MAR-2002] 


1..402 
1..354


352/402(87%) 
352/402(87%) 


0.0 


AAG75494 


Human colon cancer antigen 
protein SEQ ID NO:6258 - 
Homo sapiens, 370 aa. 
[WO200122920-A2, 
05-APR-2001] 


2..402 
18..370 


352/401 (87%) 
352/401 (87%) 


0.0 
ABG08350


Novel human diagnostic
protein #8341 - Homo
sapiens, 382 aa.
[WO200175067-A2,
11-OCT-2001]


1..402
24..382


330/407 (81%)
333/407 (81%)


e-178


AAU11925 


Protein sequence of rabbit 
paraoxonase-3 (PON3) 
mutant D324N - Oryctolagus 
cuniculus, 355 aa. 
[WO200190336-A2, 
29-NOV-2001] 


1..402 
1..355 


294/403 (72%) 
318/403 (77%) 


e-164 


AAU11922 


Protein sequence of rabbit 
paraoxonase-3 (PON3) 
mutant N169D - Oryctolagus 
cuniculus, 355 aa. 
[WO200190336-A2, 
29-NOV-2001] 


1..402 
1..355 


294/403 (72%) 
318/403 (77%) 


e-164 



In a BLAST search of public sequence databases, the NOV47a protein was found to
have homology to the proteins shown in the BLASTP data in Table 47D.



Table 47D. Public BLASTP Results for NOV47a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV47a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 


Q15166 


Serum 

paraoxonase/arylesterase 3 
(EC 3.1.1.2) (EC 3.1.8.1) 
(PON 3) (Serum
aryldialkylphosphatase 3)
(A-esterase 3) (Aromatic 
esterase 3) - Homo sapiens 
(Human), 354 aa. 


1..402 
1..354 


354/402 (88%) 
354/402 (88%) 


0.0 


Q9BZH9 


Paraoxanase-3 - Homo 
sapiens (Human), 354 aa 
(fragment). 


1..402 
1..354 


351/402 (87%) 
351/402 (87%) 


0.0 


Q9BGN0 


Paraoxonase 3 - Oryctolagus 
cuniculus (Rabbit), 354 aa. 


1..402 
1..354 


293/402 (72%) 
318/402 (78%) 


e-164 
Q62087


Serum 

paraoxonase/arylesterase 3 
(EC 3.1.1.2) (EC 3.1.8.1) 
(PON 3) (Serum 
aryldialkylphosphatase 3)
(A-esterase 3) (Aromatic 
esterase 3) - Mus musculus 
(Mouse), 354 aa. 


1..402
1..354


283/402 (70%)
314/402 (77%)


e-158 


Q90952 


Serum 

paraoxonase/arylesterase 2 
(EC 3.1.1.2) (EC 3.1.8.1) 
(PON 2) (Serum 
aryldialkylphosphatase 2)
(A-esterase 2) (Aromatic 
esterase 2) - Gallus gallus 
(Chicken), 354 aa. 


1..402 
1..354 


230/402 (57%) 
287/402(71%) 


e-131 
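The Expect column mixes three notations: `0.0`, full scientific notation such as `1.2e-190`, and the abbreviated form `e-164`, in which the leading mantissa of 1 is omitted. When these tables are transcribed for further analysis, a small normalizer keeps the values comparable. This is a minimal sketch; the function name is hypothetical.

```python
def parse_expect(text: str) -> float:
    """Convert an Expect value as printed in the tables to a float.

    The abbreviated form 'e-164' is read as 1e-164.
    """
    text = text.strip()
    if text.lower().startswith("e"):
        text = "1" + text
    return float(text)


# Values taken from the tables above.
for cell in ("0.0", "e-164", "1.2e-190"):
    print(cell, "->", parse_expect(cell))
```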



PFam analysis predicts that the NOV47a protein contains the domains shown in the 
Table 47E. 



Table 47E. Domain Analysis of NOV47a 


Pfam Domain 


NOV47a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


Arylesterase 


2..402 


230/422(55%) 
348/422(82%) 


1.2e-190 



Example 48. 

The NOV48 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 48A.



Table 48A. NOV48 Sequence Analysis 




SEQ ID NO: 191 | 2109 bp


NOV48a, 
CG56234-01 
DNA Sequence 


CCTTCCATACCTCCCCGGCTCCGCTCGGTTCCTGGCCACCCCGCAGCCCCTGCCCAGGTGCCATGGC 


CGCATTGTACCGCCCTGGCCTGCGGCTTAACTGGCATGGGCTGAGCCCCTTGGGCTGGCCATCATGC 
CGTAGCATCCAGACCCTGCGAGTGCTTAGTGGAGATCTGGGCCAGCTTCCCACTGGCATTCGAGATT 
TTGTAGAGCACAGTGCCCGCCTGTGCCAACCAGAGGGCATCCACATCTGTGATGGAACTGAGGCTGA 
GAATACTGCCACACTGACCCTGCTGGAGCAGCAGGGCCTCATCCGAAAGCTCCCCAAGTACAATAAC 
TGCTGGCTGGCCCGCACAGACCCCAAGGATGTGGCACGAGTAGAGAGCAAGACGGTGATTGTAACTC 
CTTCTCAGCGGGACACGGTACCACTCCCGCCTGGTGGGGCCCGTGGGCAGCTGGGCAACTGGATGTC 
CCCAGCTGATTTCCAGCGAGCTGTGGATGAGAGGTTTCCAGGCTGCATGCAGGGCCGCACCATGTAT 
GTGCTTCCATTCAGCATGGGTCCTGTGGGCTCCCCGCTGTCCCGCATCGGGGTGCAGCTCACTGACT 
CAGCCTATGTGGTGGCAAGCATGCGTATTATGACCCGACTGGGGACACCTGTGCTTCAGGCCCTGGG 
AGATGGTGACTTTGTCAAGTGTCTGCACTCCGTGGGCCAGCCCCTGACAGGACAAGGGGAGCCAGTG 


AGCCAGTGGCCGTGCAACCCAGAGAAAACC^ 

CCTTCGGCAGCGGCTATGGTGGCAACTCCCTGCXGGGCAAGAAGTGCTTTGCCCTACGCATCGCCTC 
TCGGCTGGCCCGGGATGAGGGCTGGCTGGCAGAGCACATGCTGATCCTGGGCATCACCAGCCCTGCA 
GGGAAGAAGCGCTATGTGGCAGCCGCCTTCCCTAGTGCCTGTGGCAA.GACCAACCTGGCTATGATGC 
GGCCTGCACTGCCAGGCTGGAAAGTGGAGTGTGTGGGGGATGATATTGCTTGGATGAGGTTTGACAG 
TGAAGGTCGACTCCGGGCCATCAACCCTGAGAACGGCTTCTTTGGGGTTGCCCCTGGTACCTCTGCC 
ACC ACCAATC CC AACGC C ATGGC TAC AATC CAGAG TAACAC T ATTTT TAC CAATGTGGCTG AGAC CA 

GTGATGGTGGCGTGTACTGGGAGGGCATTGACCAGCCTCTTCCACCTGGTGTTACTGTGACCTCCTG 
GCTGGGCAAACCCTGGAAATCTGGTGACAAGGAGCCCTGTGCACATCCCAACTCTCGATTTTGTGCC 
CCGGCTCGCCAGTGCCCCATCATGGACCCAGCCTGGGAGGCCCCAGAGGGTGTCCCCATTGACGCCA 
TCATCTTTGGTGGCCGCAGACCCAAAGGGGTACCCCTGGTATACGAGGCCTTCAACTGGCGTCATGG 
GGTGTTTGTGGGCAGCGCCATGCGCTCTGAGTCCACTGCTGCAGCAGAACACAAAGGGAAGATCATC 
ATGCACGACCCATTTGCCATGCGGCCCTTTTTTGGCTACAACTTCGGGCACTACCTGGAACACTGGC 
TGAGCATGGAAGGGCGCAAGGGGGCCCAGCTGCCCCGTATCTTCCATGTCAACTGGTTCCGGCGTGA 
CGAGGCAGGGCACTTCCTGTGGCCAGGCTTTGGGGAGAATGCTCGGGTGCTAGACTGGATCTGCCGG 
CGGTTAGAGGGGGAGGACAGTGCCCGAGAGACACCCATTGGGCTGGTACCAAAGGAAGGAGCCTTGG 
ATCTCAGCGGCCTCAGAGCTATAGACACCACTCAGCTGTTCTCCCTCCCCAAGGACTTCTGGGAACA 
GGAGGTTCGTGACATTCGGAGCTACCTGACAGAGCAGGTCAACCAGGATCTGCCCAAAGAGGTGTTG 
GCTGAGCTTGAGGCCCTGGAGAGACGTGTGCACAAAATGTGACCTGAGGCCCTAGTCTAGCAAGAGG 
ACATAGCACCCTCATCTGGGAATAGGGAAGGCACCTTGCAGAAAATATGAGCAATTTGATATTAACT 


AACATCTTCAATGTGCCATAGACCTTCCCACA 




ORF Start: ATG at 63 | ORF Stop: TGA at 1983





SEQ ID NO: 192 | 640 aa | MW at 70688.2kD


NOV48a, 
CG56234-01 
Protein Sequence 


MAALYRPGLRLNWHGLSPLGWPSCRSIQTLRVLSOT 

AENTATLTLLEQQGLIRKLPKYNNCWLARTDPK^ 

MSPADFQRAVDERFPGCMQGRTimrLPFSMGPVGSPLSRIGVQLT^ 

LGDGDFVKCLHSVGQPLTGQGEPVSQWPCNPEKTLIGHVPDQREIISFGSGYGGNSLLGKKCFALRI 
ASRLARDEGWLAEHML I LG ITS P AGKKR YVAAAFP S AC GKTNIiAMMR PAL PGWKVEC VGDD IAWMRF 
DSEGRLRAINPENGFFGVAPGTSATTNPNAMATIQSNTIFTWAETSDGGVYWEGIDQPLPPGVTVT 
SV^GKPTOSGDKEPCAHPNSRFCAPARQC PIMDPAWEAPEGVPIDAI IFGGRRPKGVPLVYEAFNWR 
HGVFVG S AMR S E S TAAAEHKGKI UMGHDPF AMRP F FG YNFGHYL EHWL SMEGRKGAQL PRI FHVNWFR 
RDEAGHFLWPGFGENARVLDWICRRLEGEDSARETPIGLVPKEGAIiDLSGLRAIDTTQLFSLPKDFW 
EQEVRDIRSYLTEQVNQDLPKEVLAELEALERRVHKM 








SEQ ID NO: 193 | 2069 bp


NOV48b, 
CG56234-02 
DNA Sequence 


CCCGCCTTCCATACCTCCCCGGCTCCGCTCGGTTCCTGGCCACCCCGCAGCCCCTGCCCAGGTGCCA 


TGGCCGCATTGTACCGCCCTGGCCTGCGGCTTAACTGGCATGGGCTGAGCCCCTTGGGCTGGCCATC 

ATGCCGTAG(^TCCAGACCCTCCGAGTGCTTAGTGGAGATCTGGGCCAGCTTCC(^CTGGCATTCGA 

GATTTTGTAGAGCACAGTGCCCGCCTGTGCCAACCAGAGGGCATCCACATCTGTGATGGAACTGAGG 

CTGAGAATACTGCCACACTGACCCTGCTGGAGGAGCAGGGCCTCATCCGAAAGCTCCCCAAGTAG^ 

TAACTGCTGGCTGGCCCGCACAGACCCCAAGGATGTGGCACGAGTAGAGAGCAAGACGGTGATTGTA 

ACTCCTTCTCAGCGGGACACGGTACCACTCCCGCCTGGTGGGGCCTGTGGGCAGCTGGGCAA 

TGTCCCCAGCTGATTTCCAGCGAGCTGTGGATGAGAGGTTTCCAGGCTGCATGCAGGGCCGCACCAT 

GTATGTGCTTCCATTCAGCATGGGTCCTGTGGGCTCCCCGCTGTCCCGCATCGGGGTGCAGCTCACT 

GACTCAGCCTATGTGGTGGCAAGCATGCGTATTATGACCCGACTGGGGACACCTGTGCTQ?CAGGCCC 

TGGGAGATGGTGACTTTGTCAAGTGTCTGCACTCCGTGGGCCAGCCCCTGACAGGACAAGGGGAGCC 

AGTGAGCCAGOMSGCCGTGCAACCCAGAGAAAACCCTGATTGGCCACGTGCCCGACCAGCGGGAGATC 

ATCTCCTTCGGCAGCGGCTATGGTGGCAACTCCCTGCTGGGCAAGAAGTGCTTTGCCCTACGCATCG 

CCTCTCGGCTGGCCCGGGATGAGGGCTGGCTGGCAGAGCACATGCTGATCCTGGGCATCACCAGCCC 

TGCAGGGAAGAAGGCGCTATGTGCAGCCGCCTTCCCTAGTGCCTGTGGCAAGACCAACCTGGCTATG 

ATGCGGCCTGCACTGCCAGGCTGGAAAGTGGAGTGTGTGGGGGATGATATTGCTTGGATGAGGTTTG 

ACAGTGAAGGTCGACTCCGGGCCATCAACCCTGAGAACGGCTTCTTTGGGGTTGCCCCTGGTACCTC 

TGCCACCACCAATCCCAACGCCATGGCTACAATCCAGAGTAACACTATTTTTACCAATGTGGCTGAG 

ACCAGTGATGGTGGCGTGTACTGGGAGGGCATTGACCAGCCTCTTCCACCTGGTGTTACTGTGACCT 

CCTGGCTGGGCAAACCCTGGAAACCTGGTGACAAGGAGCCCTGTGCACATCCCAACTCTCGATTTTG 

TGCCCCGGCTCGCCAGTGCCCCATCATGGACCCAGCCTGGGAGGCCCCAGAGGGTGTCCCCATTGAC 

GCCATCATCTTTGGTCGCCGCAGACCGAAAGGGAAGATCATCATGCACGACCCATTTGCCATGCGGC 

CCTTTTTTGGCTACAACTTCGGGCACTACCTGGAACACTGGCTGAGCATGGAAGGGCGCAAGGGGGC 

CCAGCTGCCCCGTATCTTCCATGTCAACTGGTTCCGGCGTGACGAGGCAGGGCACTTCCTGTGGCCA 

GGCTTTGGGGAGAATGCTCGGGTGCTAGACTGGATCTGCCGGCGGTTAGAGGGGGAGGACAGTGCCC 
GMAGACACCCATTGGGCTGGT^

CACCACTCAGCTGTTCTCCCTCCCCAAGGACTTCTGGGAACAGGAGGTTCGTGACATTCGGAGCTAC 
C TG ACAGAGCAGGTC AACC AGGATCTGCC CAAAGAGGTGTTGGCTGAGC T TGAGGCCCTGGAG AGAC 
GTGTGC ACAAAATGTGACC TGAGGCC TAGTCTAGCAAG AGGACATAG C ACCCTC ATC TGGG AATAGG 


GAAGGCACCTTGCAGAAAATATGAGCAATTGATATTAACTAACATCTTCAATGTGCCATAGACCTTP 


CCACAAAGACTGTCCAATAATAAGAGATGCTTATCTATTTTAAAAAAAAAAAAAAAAAA 




ORF Start: ATG at 67 | ORF Stop: TGA at 1891





SEQ ID NO: 194 | 608 aa | MW at 67027.1kD


NOV48b, 
CG56234-02 
Protein Sequence 


maalyrpglrlnwhgls plgwpscrs iqtlrvlsgdlgqlptgirdfvehsarlcqpegihicdgte 
aentatltlleqqglirklpkyi^cwlartdpkdvarveskwivtpsqrdwplppggacgqlg™ 
mspadfqravderfpgcmqgrtmyvlpfsmgpvgsplsrigvqltdsaywasmrimtrlgtpvlqa 
lgdgdfvkclhsvgqpltgqgepvsqwpcnpektlighvpdqreiisfgsgyggnsllgkkcfalri 
asrlardegwlaehmlilgitspagkkalcaaafpsac^^ 

dsegrlrainpengffgvapgtsattnpnamatiqsntiftnvaetsdggvywegidqplppgvtvt 
swlgkpwkpgdkepcahpnsrfcaparqcpimdpaweapegvpida i ifggrrpkgki imhdpfamr 
pffgynfghylehwlsmegrkgaqlprifhvnwfrrdeaghflwpgfgenarvldwicrrlegedsa 

RETPIGLVPKEGALDI1SGLRAIDTTQLFSLPKDFWEQEVRDIRSYLTEQVNQDLPKEVI1AELEALER 
RVHKM 



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 48B. 



Table 48B. Comparison of NOV48a against NOV48b. 


Protein Sequence 


NOV48a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV48b 


1..640
1..608 


577/640(90%) 
577/640(90%) 
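Each Identities/Similarities cell has the form `matched/aligned (percent)`. A transcribed cell can be sanity-checked by recomputing the percentage from the two counts. The sketch below is one way to do that, with a hypothetical function name; it reports the exact ratio and makes no assumption about how the printed whole-number percentage was rounded.

```python
import re
from typing import Tuple


def parse_match_cell(cell: str) -> Tuple[int, int, float]:
    """Parse 'matched/aligned (NN%)' and return (matched, aligned, percent)."""
    m = re.match(r"\s*(\d+)\s*/\s*(\d+)", cell)
    if m is None:
        raise ValueError(f"not a matched/aligned cell: {cell!r}")
    matched, aligned = int(m.group(1)), int(m.group(2))
    if aligned == 0:
        raise ValueError("aligned length must be positive")
    return matched, aligned, 100.0 * matched / aligned


# NOV48a against NOV48b, from Table 48B
matched, aligned, pct = parse_match_cell("577/640(90%)")
print(matched, aligned, round(pct, 1))  # 577 640 90.2
```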



Further analysis of the NOV48a protein yielded the following properties shown in 
Table 48C. 



Table 48C. Protein Sequence Properties NOV48a


PSort analysis: 


0.6402 probability located in microbody (peroxisome); 0.3000 probability 
located in nucleus; 0.2412 probability located in lysosome (lumen); 0.1000 
probability located in mitochondrial matrix space 


SignalP analysis: 


No Known Signal Sequence Predicted 
A search of the NOV48a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 48D.



Table 48D. Geneseq Results for NOV48a


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV48a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAY80296 


Human mitochondrial
phosphoenolpyruvate 
carboxykinase SEQ ID NO: 1 
- Homo sapiens, 640 aa. 
[US6030837-A, 
29-FEB-2000] 


1..640 
1..640 


634/640 (99%) 
634/640(99%) 


0.0 


AAB71890 


Mouse PEPCK-cytosolic 
protein - Mus musculus, 622 
aa. [US6187545-B1, 
13-FEB-2001] 


31..640
14..622 


440/610 (72%) 
519/610(84%) 


0.0 


AAB71880 


Human PEPCK-cytosolic 
protein - Homo sapiens, 622 
aa. [US6187545-B1,
13-FEB-2001] 


31..640 
14..622 


438/610 (71%) 
517/610 (83%) 


0.0 


ABB65318 


Drosophila melanogaster
polypeptide SEQ ID NO 
22746 -Drosophila 
melanogaster, 647 aa. 
[WO200171042-A2, 
27-SEP-2001] 


27..640 
35..647 


394/616 (63%) 
480/616 (76%) 


0.0 


ABB65322 


Drosophila melanogaster 
polypeptide SEQ ID NO 
22758 -Drosophila 
melanogaster, 638 aa. 
[WO200171042-A2, 
27-SEP-2001] 


30..640 
29..638 


402/613 (65%) 
469/613 (75%) 


0.0 



In a BLAST search of public sequence databases, the NOV48a protein was found to
have homology to the proteins shown in the BLASTP data in Table 48E.



Table 48E. Public BLASTP Results for NOV48a 



Protein

Accession 

Number 


Protein/Organism/Length 


NOV48a 
Residues/ 
Match 
Residues 



Identities/ 
Similarities for 
the Matched 
Portion 



Expect 
Value 


Q16822


Phosphoenolpyruvate
carboxykinase, mitochondrial
precursor [GTP] (EC 4.1.1.32)
(Phosphoenolpyruvate 
carboxylase) (PEPCK-M) - 
Homo sapiens (Human), 640 
aa. 


1..640
1..640 


635/640 (99%) 
635/640(99%) 


0.0 


S69546 


phosphoenolpyruvate 
carboxykinase (GTP) (EC
4.1.1.32) precursor,
mitochondrial - human, 640 
aa. 


1..640 
1..640 


634/640 (99%) 
634/640 (99%) 


0.0 


Q91Z10 


Similar to
phosphoenolpyruvate
carboxykinase 2 - Mus
musculus (Mouse), 640 aa.


1..640 
1..640 


590/640 (92%) 
609/640 (94%) 


0.0 


Q8R3X7 


Similar to RIKEN cDNA 
9130022B02 gene - Mus
musculus (Mouse), 535 aa 
(fragment). 


106..640 
1..535 


504/535 (94%) 
518/535 (96%) 


0.0 


P07379 


Phosphoenolpyruvate 
carboxykinase, cytosolic 
[GTP] (EC 4.1.1.32)
(Phosphoenolpyruvate 
carboxylase) (PEPCK-C) - 
Rattus norvegicus (Rat), 622 
aa. 


31..640 
14..622 


441/610 (72%) 
520/610(84%) 


0.0 



PFam analysis predicts that the NOV48a protein contains the domains shown in the 
Table 48F. 



Table 48F. Domain Analysis of NOV48a 


Pfam Domain 


NOV48a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


PEPCK 


46..640 


445/608 (73%) 
591/608(97%) 


0 
Example 49.

The NOV49 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 49A. 



Table 49A. NOV49 Sequence Analysis 




SEQ ID NO: 195 | 1202 bp


NOV49a, 
CG56836-01 
DNA Sequence 


TGTAAGCGATCTGGTTCCCACCTCAGCCTCCCGAGTAGTGTCTTCAGGCCTATGGAGAGCAGCTTGC 


GTGGGCTGGGCCTGCAGTACCTGGTTTGCATAGATGATTGGCAGGTGGATCTAGGATCCGGCTTCC^ 


A£ATGTGGCAGCTCTGGGCCTCCCTCTGCTGCCTGCTGGTGTTGGCCAATGCCCGGAGCAGGCCCTC 
TTTCCATCCC C TGTCGGATGAGC TGGTC AACTATGTCAACAAACGGAATAC C ACGTGGCAGGCCGGG 
CACAAC TTCTACAACGTGGACATGAGCTACTTGAAGAGGC TATGTGGTAC C TTCCTGGGTGGGCCCA 
AGCCACCCCAGAGAGTTATGTTTACCGAGGACCTGAAGCTGCCTGCAAGCTTCGATGCACGGGAACA 
ATGGCCACAGTGTCCCACCATCAAAGAGATCAGAGACCAGGGCTCCTGTGGCTCCTGCTGGGCCTTC 
GGGGCTGTGGAAGCOATCTCTGACCGGATCTGCATCCACACCAATGCG»CGTCAGCGTGGAGGTGT 
CGGCGGAGGACCTGCTCACATGCTGTGGCAGCATGTGTGGGGACGGCTGTAATGGTGGCTATCCTGC 
TGAAGCTTGGAACTTCTGGACAAGAAAAGGCCTGGTTTCTGGTGGCCTCTATGAATCCCATGTAGGG 
TGCAGACCGTACTCCATCCCTCCCTGTGAGCACCACGTCAACGGCTCCCGGCCCCCATGCACGGGGG 
AGGGAGATACCCCCAAGTGTAGCAAGATCTGTGAGCCTGGCTACAGCCCGACCTACAAACAGGACAA 
GCAC TACGGATACAAT TCCTAC AGCGTCTCCAATAGCGAG AAGGACATCATGGCCGAGATCTACAAA 
AACGGCCCCGTGGAGGGAGCTTTCTC TGTGTATTCGGACTTCCTG CTCTACAAGTCAGGAGTGTAC C 
AACACGTCACCGGAGAGATGATGGGTGGCCATGCCATCCGCATCCTGGGCTGGGGAGTGGAGAATGG 
CAC ACCC TACTGGC TGGTTGCC AACTCCTGGAAC ACTGACTGGGGTGAC AATGGCTTC TTTAAAATA 
CTCAGAGGACAGGATCACTGTGGAATCGAATCAGAAGTGGTGGCTGGAATTCCACGCACCGATCAGT 
ACTGGGAAAAGATCTAATCTGCCGTGGGCCTGTCGTGCCAGTCCTGGGGGCGAGATCGGGGTA 




ORF Start: ATG at 137 | ORF Stop: TAA at 1154





SEQ ID NO: 196 | 339 aa | MW at 37821.3kD


NOV49a, 
CG56836-01 
Protein Sequence 


MWQLWASLCCLLVIJ^ARSRPSFHPLSDEIiV^ 
PEQRVMFTEDLKLPASFDAREQWPQCPTIKEIRDQGSCGSC^^ 

AEDLLTCCGSMCGDGCNGGYPAEAWNFWTRKGLVSGGLYESHVGCRPYSIPPCEHHVNGSRPPCTGE 
GDTPKC SKICEPGYSPT YKQDKHYGYNSYS VSNSEKDIMAE I YKNG PVEGAFSVYSDFLLYKSGVYQ 
HVTGEMMGGHAIRILGWGVBNGTP YWLVANSWNTDWGDNGFFK I LRGQDHCG IESEVVAGI PRTDQ Y 
WEKI 





SEQ ID NO: 197 | 723 bp


NOV49b, 
CG56836-02 
DNA Sequence 


ACATGGTGGATCTAGGATCCGGCTTCCAACATGTGGC^GCTCTGGGCCTCCCTCTGCTGrrTnrTr^ 


tgttggccaatgcccggagcaggccx:tctttccatcccctgtcggatgagctggtcaactatgtcaa 
caaacggaataccacgtggc aggc cgggc acaac ttc tacaacgtggacatgagctacttgaagagg 
ctatgtggtaccttcctgggtgggcccaagccaccccagagagttatgtttaccgaggacctgaagc 
tgcctgc aagc ttcgatgcacgggaacaatggccacagtgtccc ac c atcaaagagatcagagacca 
gggctcctgtggctcc tgc tgggc cttcggggctgtggaagccatctctgaccggatctgcatcc ac 
acc aatgcgcacgtc agcgtggaggtgtcggcggaggacctgctc acctgc c tgctc tacaagtcag 
gagtgtaccaacacgtcaccggagagatgatgggtggccatgccatccgcatcctgggctggggagt 
ggagaatggcacaccctactggctggttgccaactcctggaacactgactggggtgacaatggcttc 
tttaaaatactcagaggacaggatcactgtggaatcgaatcagaagtggtggctggaattccacgca 

CCGATCAGTACTGGGAAAAGATCTAATCTGCCGTGGGCCTGTCGTGCCAAACC 




ORF Start: ATG at 31 | ORF Stop: TAA at 694
SEQ ID NO: 198 | 221 aa | MW at 24974.2kD


NOV49b, 
CG56836-02 
Protein Sequence 


MWQLTOSLCCLLVLANARSRPSFHFLSDELV^ 
PPQRVMPTEDLKLPASFDAREQWPQCPTiKEIRDQGSCGSCWAFG 

AEDLLTCLL YKSGVYQHVTG EMMGGHAI R I LGWGVENGT P YWLVANS WNTDWGDNGF FKI LRGQDHC 





SEQ ID NO: 199 | 1028 bp


NOV49c,
CG56836-03 
DNA Sequence 


TQTA&GCGATCffGGTTCCCACCTC^ 


y.TGGGCTGGGCC TGC AGTACC TGGTT TGC ATAGATG ATTGGCAGGTGGATC TAGGATC CGGCTTCr A 
ACATGTGGCAGCTCTGGGCCTCCCTCTGCTGCCTGCTGGTGTTGGCCAATGCCCGGAGCAGGCCCTC 
TTTCCATCCCCTGTCGGATGAGCTGGTCAACTATGTCAACAAACGGAATACCACGTGGCAGGCCGGG 
CAGAACTTCTACAACGTGGACATGAGC TACTTGAAGAGGC TATGTGGTACCTTCCTGGGTGGGCCCA 
AGCCACCCC AGAGAGT TATGTTTACCGAGGACCTGAAGCTGCC TGC AAGC TTCGATGC ACGGGAAC A 
ATGGCCACAGTGTCCCACCATCAAAGAGATCAGAGACCAGGGCTCCTGTGGCTCCTGCTGGGCCTTC 
GGGGCTGTGGAAGCCATCTC TGACCGGATCTGC ATC CACACCAATGCGCACGTCAGCGTGGAGGTGT 
CGGCGGAGGACCTGC TCAC ATGCTGTGGC AGCATGTGTGGGGACGGCTGTAATGGTGGC TATCCTGC 
TGAAGCTTGGAACTTCTGGACAAGAAAAGGCCTGGTTTCTGGTGGCCTCTATGAATCCAATAGCGAG 
AAGGACATCATGGCCGAGATC TACAAAAACGGC CCCGTGGAGGGAGCTTTCTCTGTGTATTCGGAC T 
TCCTGCTCTACAAGTCAGGAGTGTACCAACACGTCACCGGAGAGATGATGGGTGGCCATGCCATCCG 
C^TCCTGGGCTGGGGAGTGGAGAATGGCACACCCTACTGGCTGGTTGCCAACTCCTGGAACACTGAC 
TGGGGTGACAATGGCTTCTTTAAAATACTCAGAGGACAGGATCACTGTGGAATCGAATCAGAAGTGG 
TGGCTGGAATTCCACGCACCGATCAGTACTGGGAAAAGATCTAATCTGCCGTGGGCCTGTPRTRrrA 
GTCCTGGGGGCGAGATCGGGGTA 




ORF Start: ATG at 137 | ORF Stop: TAA at 980





SEQ ID NO: 200 | 281 aa | MW at 31423.2kD


NOV49c, 
CG56836-03 
Protein Sequence 


MWQLWASLCCLLVLANARSRPSFOT 

PPQRVMFTEDLKI* PAS FDAREQWPQCPTIKE IRDQGSCX5SCWAFGAVEA ISDRICIHTNAHVSVEVS 

AEDLLTCCGSMCGDGCNGGYPAB^nSTFWTRKGLVSGGLYESNSEKDIMAEIYK^ 

LL YKSGVYQHVTG EMMGGHAIRILGWGVENGTP YWLVANSWNTDWGDNGFFK ILRGQDHC 

AG I PRTDQYWEK I 





SEQ ID NO: 201 | 1028 bp


NOV49d, 
CG56836-04 
DNA Sequence 


TCTAAGCGATCTGGTTCCCACCTCAGCCTCCCGAGTAGTGTCTTCAGGCCTATGGAGAGCAGCTTGC 
GTGGGCTGGGCCTGCAGTACC TGGTTTGCATAGATGATTGGC AGGTGGATCTAGGATCCGGCTTCf! A 


^atgtggcack:tctggck:ctccctctgctgcctgctggtgttggccaatgcccggagcaggccctc 

TTTCCATCCCCTGTCGGATGAGCTGGTCAACTATGTCAACAGACGGAATACCACGTGGCAGGCCGGG 
CACAAC TTCTAC AACGTGGAC ATGAGC TACTTGAAGAGGCTATGTGGTACCTTCCTGGGTGGGCCCA 
AGCC ACCCCAGAGAGTTATGTTTACCGAGGACCTGAAGCTGCC TGC AAGC TTCGATGCACGGGAACA 
ATGGCCACAGTGTCCCACCATCAAAGAGATCAGAGACCAGGGCTCCTGTGGCTCCTGCTGGGTTTCT 
GGTGGCCTCTATGAATCCCATGTAGGGTGCAGACCGTACTCCATCCCTCCCTGTGAGCACCACGTCA 
ACGGCTCCCGGCCCCCATGCACGGGGGAGGGAGATACCCCGAAGTGTAGCAAGATCTGTGAGCCTGG 
CTACAGCCCGACCTACAAACAGGACAAGCACTACGGATACAATTCCTACAGCGTCTCCAATAGCGAG 
AAGGACATCATGGCCGAGATC TACAAAAACGGCC CCGTGGAGGGAGCTTTCTCTGTGTATTCGGACT 
TCCTGCTCTACAAGTCAGGAGTGTACCAACACGTCACCGGAGAGATGATGGGTGGCCATGCCATCCG 
CATCCTGGGCTGGGGAGTGGAGAATGGCACACCCTACTGGCTGGTTGCCAACTCCTGGAACACTGAC 
TGGGGTGACAATGGC TTCT TTAAAATACTCAGAGGACAGGATCACTGTGG AATCGAATCAGAAGTGG 
TGGCTGGAATTCCACGCACCGATCAGTACTGGGAAAAGATCTAATCTGCCGTGGGCCTGTCGTGCCA 
GTCCTGGGGGCGAGATCGGGGTA 
ORF Start: ATG at 137





SEQ ID NO: 202 | 281 aa | MW at 31732.5kD


NOV49d, 
CG56836-04 
Protein Sequence 


MWQLWASLCCLLVLANARSRPSFHPLSDE^^ 

PPQRVMFTEDLKL PASFDAREQWPQC PT IKE IRDQGSCGSCOTSGGIi YESHVGCRP YS I p PCEHHVN 
GSRPPCTGEGDT PKC SK ICE PGYSPTYKQDKHYGYNS YS VSNSEKDIMAE I YKNGPV32GAF S VYSDF 
LL YK SG VYQHVTG EMMGGHA I R I LG WGVENG T PYWLVANS WNTDWGDNGF FK I LRG QDHCG I E S E W 
AGIPRTDQYWEKI 





SEQ ID NO: 203 | 340 bp


NOV49e, 
247856403 DNA 
Sequence 


AGGCTCCGCGGCCGCCCCCTTCACCGGATCCCTGCCTGCAAGCTTCGATGCACGGGAACAATGGCCA 
CAGTGTCCCACCATCAAAGAGATCAGAGACCAGGGCTCCTGTGGCTCCTGCTGGGCCTTCGGGGCTG 
"IGGAAGCCATCTCTGACCGGATCTGCATCCACACC^ 

GGACC TGCTCACATGC TGTGGCAGCATGTGTGGGGACGGCTGTAATGGTGGCTATCCTGC TGAAGCT 
TGGAACTTC TGG ACAAGAAAAGGCCTGGTTTCTGGTGGCCTC TATCTCGAGGGCAAGGGTGGGCGCG 
CCGAC. 




ORF Start: at 2 | ORF Stop: end of sequence





SEQ ID NO: 204 | 113 aa | MW at 11834.0kD


NOV49e, 
247856403 
Protein Sequence 


GSAAAPFTGSLPASFDAREQWPQCPTIKEIREQGSCGSCWAFGAVEAISDRICIHTNAHVSVEVSAE 
DliTCCGSMCGTDGCNGGYPAEAWNFWTRKGLVSGGLYIiEGKGGRAD 





SEQ ID NO: 205 | 376 bp


NOV49f, 
247856434 DNA 
Sequence 


AGGCTCCGCGGCCGCCCCCTTCACCGGATCCTCCAA^ 

AAAAACGGCCCCGTGGAGGGAGCTTTCTCTGTGTATTCGGACTTCCTGCTCTACAAGTCAGGAGTCT 
ACCAACACGTCACCGGAGAGATGATGGGTGGCCATGCCATCCGCATCCTGGGCTGGGGAGTGGAGAA 
TGGCACACCCTACTGGCTGGTTGCCAACTC CTGGAACACTGACTGGGGTGACAATGGCTTC TTTAAA 
ATACTCAGAGGACAGGATCACTGTGGAATCGAATCAGAAGTGGTGGCTGGAATTCCACGCACCGATC 
AGTACTGGGAAAAGATCCTCGAGGGCAAGGGTGGGCGCGCC 




ORF Start: at 2 | ORF Stop: end of sequence





SEQ ID NO: 206 | 125 aa | MW at 13666.1kD


NOV49f, 
247856434 
Protein Sequence 


GSAAAPFTGSSNSEKDIMAEIYKNGPVEGAFSVYSDFLLYKSGVYQHVTGEMMGGHAIRILGWG^ 
GTP YWLVANSWNTDWGDNGFFK I LRGQDHCGIESEWAG I PRTDQYWEKILEGKGGRA 







SEQ ID NO: 207



^^^^ 



ORF Start: at 2 | ORF Stop: end of sequence



NOV49g, 
247856497 
Protein Sequence 



SEQ ID NO: 208 | 191 aa | MW at 20877.5kD



™AHVSWSAEDLLTCrcSMCGDGCNGOT^ 



:ggra 



NOV49h, 
247856493 DNA 
Sequence 



SEQ ID NO: 209 | 590 bp



GTGTTGK3CCAATGCCCX:C^GCA0K3CCCTCTTTCCM^ 

acaaacggaatagcacg^aggccgggcacaacttctaSctg^ 

CTTOTGGCCTCTATCT^^ 



ORF Start: at 2 | ORF Stop: end of sequence



NOV49h, 
247856493 
Protein Sequence



SEQ ID NO: 210 | 197 aa | MW at 21367.0kD



tnahvsve^saedlltccgsmcgdgcnggypaeawnfwtrkglvsgglylegtcg 



SEQ ID NO: 211 | 551 bp

lAG^TCCGC^CGCCCCCTTCacC^ATCCCGGAGCAGGCCCT C^TTCCATCCCCTOT ^ 



NOV49i, 






247856574 DNA 
Sequence 


CTGGTCAMTATCTCAAG^ 

TGAGCTACTTGAAGAGGCTATGTGGTACCTTCCTGGG0X3C3GCCCAAGCCACCCCAGAGAGTTATGTT 
TACCGAGGACCTGAAGCTGC C TGCAAGCTTCGATGC ACGGGAACAATGGCCACAGTGTCCCACCATC 
AAAGAGATCAGAGACCAGGGCTCCTGTGGCTCCTGCTGGGCCTTCGGGGCTGTGGAAGCCATCTCTG 
ACCGGATCTGCATCCACACCAATGCGCACGTCAGCGTGGAGGTGTCGGCGGAGGACCTGCTCACATG 
CTGTGGCAGCATGTGTGGGGACGGCTGTAATGGTGGCTATCCTGCTGAAGCTTGGAACTTCTGGACA 
AGAAAAGGC CTGGTTTC TGGTGGCCTCTATCTCGAGGGCAAGGGTGGGCGCCC CGAC CC AGC TTTCC 
CGTACAAAGC TGGCA 




ORF Start: at 2 | ORF Stop: end of sequence





SEQ ID NO: 212 | 184 aa | MW at 19933.2kD


NOV49i, 
247856574 
Protein 
Sequence 


GSAAAPFTGSRSRPSFHPLSDELVNYVNKROT^ 

TEDLKLPASFDAREQWPQCPTIKE IRDQGS CGSCWAFGAVEAI SDR ICIHTNAHVSVEVSAEDLLTC 
CGSMCGDGCNGGYPAEAWNFWTRKGLVSGGLYLEGKGGRPDPAFPYKAGX 





SEQ ID NO: 213 | 523 bp


NOV49j, 
247856545 DNA 
Sequence 


AGGCTCCGCGGCCGCCCCCTTCACCGGATC 

CTGGTCAACTATGTCAACAAACGGAATACCACGTGGCAGGCCGGGCACAACTTCTACAACGTGGACA 
TGAGCTACTTGAAGAGGCTATGTGGTACCTTCCTGGGTGGGCCCAAGCCACCCCTGAGAGTTATGTT 
TACCGAGGACCTGAAGCTGCCTGCAAGCTTCGATGCACGGGAACAATGGCCACAGTGTCCCACCATC 
AAAGAGATCAGAGACCAGGGC TCCTGTGGCTCCTGC TGGGCCTTCGGGGCTGTGGAAGCCATCTCTG 
ACCGGATCTGCATCCACACCAATGCGCACGTCAGCGTGGAGGTGTCGGCGGAGGACCTGCTCACATG 
CTGTGGCGGCATGTGTGGGGACGGCTGTAATGGTGGCTATCCTGCTGAAGCTTGGAACTTCTGGACA 
AGAAAAGGCCTGGTTTCTGGTGGCCTCTATCTCGAGGGCAAGGGTGGGCGCGCC 




ORF Start: at 2 | ORF Stop: end of sequence





SEQ ID NO: 214 | 174 aa | MW at 18915.1kD


NOV49j, 
247856545 
Protein 
Sequence 


GSAAAPFTGSRSRPSFHPLSDELVKYVNKRNTTWQAGHNFYira 

TEDLKLPASFDAREQWPQCPT IKEIRDQGSCGSCWAFGAVEAISDRIC IHTNAHVSVEVSAEDLLTC 
CGGMCGDGCWGGYPAEAWNFWTRKGLVSGGLYLEGKGGRA 





SEQ ID NO: 215 | 1036 bp


NOV49k, 
275480714 DNA 
Sequence 


CACCCTCGAGATGTGGCAGCTCTGGGCCTCCCTCTGCTGCCTGCTGGTGTTGGCCAATGCCCGGAGC 
AGGCCCTC TTTC CATCCC CTGTCGGATGAGCTGGTCAACTATGTCAACAAACGGAATACCACGTGGC 
AGGCCGGGCACAACTTCTACAACGTGGACATGAGCTACTTGAAGAGGCTATGTGGTACCTTCCTGGG 
TGGGCCCAAGCCACCCCAGAGAGTTATGTTTACCGAGGACCTGAAGCTGCCTGCAAGCTTCGATGCA 
CGGGAACAATGGCCACAGTGTCCCACCATCAAAGAGATCAGAGACCAGGGCTCCTGTGGCTCCTGCT 
GGGCCTTCGGGGCTGTGGAAGCCATCTCTGACCGGATCTGCATCCACACCAATGCGCACGTCAGCGT 
GGAGGTGTCGGCGGAGGACCTGCTCACATGCTGTGGCAGCATGTGTGGGGACGGCTGTAATGGTGGC 
TATCCTGCTGAAGCTTGGAACTTCTGGACAAGAAAAGGCCTGGTTTCTGGTGGCCTCTATGAATCCC 
ATGTAGGGTGCAGACCGTACTCCATCCCTCCCTGTGAGCACCACGTCAACGGCTCCCGGCCCCCATG 
CACGGGGGAGGGAGATACCCCCAAGTGTAGCAAGATCTGTGAGCCT^

CAGGACAAG CAC TACGGATAC AATTC CTACAGCGTC TCC^TAGCGAGAAGGAC ATCATGGCCGAGA 
TCTACAAAAACGGCCCCGTGGAGGGAGCTTTCTCTGTGTATTCGGACTTCCTGCTCTACAAGTCAGG 
AGTGTACCAACACGTCACCGGAGAGATGATGGGTGGCCATGCCATCCGCATCCTGGGCTGGGGAGTG 
GAGAATGGC ACACCCTAC TGGCTGGTTGC CAACTCC TGGAACACTGAC TGGGGTGACAATGGCTTCT 

TTAAAATACTCAGAGGACAGGATCACTGTGGAATCGAATCAGAAGTGGTGGCTGGAATTCCACGCAC 
CGATCAGTACTGGGAAAAGATCGTCGACGGC 



ORF Start: at 2 | ORF Stop: end of sequence





SEQ ID NO: 216 | 345 aa | MW at 38435.9kD


NOV49k,
275480714 
Protein Sequence 


TLEIWQLWASLCCLLVLANARSRPSFHPLSDEL^^ 

GPK PPQRVMFTEDLKL PAS FDAREQWPQC PTIKBIRDQGS CGSCWAFGAVEAISDRI C IHTNAHVS V 

EVSAEDLLTCCGSMCX3DGCNGGYPAEAWNFWTRKG 

TGEGDTPKCSKICEPGTTSPTYKQDKHYG™^ 

WQHVTGEX^GGHAI RI LGWGVENGTP YTOVANSWNTD^^ I ES EWAGI PRT 
DQYWEKIVDG



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 49B. 



Table 49B. Comparison of NOV49a against NOV49b through NOV49k.



Protein Sequence 


NOV49a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV49b 


1..141
1..141 


141/141 (100%) 
141/141 (100%) 


NOV49c 


1..176 
1..176


175/176 (99%) 
176/176 (99%) 


NOV49d 


1..339 
1..281 


279/339 (82%) 
280/339 (82%) 


NOV49e 


80..180 
11..111


96/101 (95%) 
96/101 (95%) 


NOV49f 


233..339 
11..117 


107/107 (100%) 
107/107 (100%) 


NOV49g 


1..180
11..190 


175/180(97%) 
175/180(97%) 


NOV49h 


1..180 
11..190


173/180 (96%) 
174/180(96%) 


NOV49i 


17..181
10..174


159/165 (96%) 
160/165 (96%) 


NOV49j 


17..180
10..173


144/164(87%) 
145/164 (87%) 
NOV49k


1..339
4..342


339/339 (100%)
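The residue ranges in Table 49B use the `start..end` notation with inclusive endpoints, so the span length is `end - start + 1`. For several rows that length equals the denominator in the Identities column (for example, 80..180 spans 101 residues and the cell reads 96/101). A minimal helper for checking transcribed ranges, with a hypothetical function name:

```python
def range_length(span: str) -> int:
    """Length of an inclusive residue range written as 'start..end'."""
    start_text, end_text = span.split("..")
    start, end = int(start_text), int(end_text)
    if end < start:
        raise ValueError("range end precedes range start")
    return end - start + 1


# Rows taken from Table 49B
print(range_length("80..180"))   # 101
print(range_length("233..339"))  # 107
```

Where the two sequences differ in length over the aligned region, the denominator also counts gap positions, so it can exceed the span on either sequence.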



Further analysis of the NOV49a protein yielded the following properties shown in
Table 49C.



Table 49C. Protein Sequence Properties NOV49a 


PSort analysis: 


0.3700 probability located in outside; 0.1900 probability located in lysosome 
(lumen); 0.1376 probability located in microbody (peroxisome); 0.1000 
probability located in endoplasmic reticulum (membrane) 


SignalP analysis: 


Cleavage site between residues 18 and 19 



A search of the NOV49a protein against the Geneseq database, a proprietary
database that contains sequences published in patents and patent publications, yielded
several homologous proteins shown in Table 49D.
Table 49D. Geneseq Results for NOV49a


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV49a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAR90616 


Anti-procathepsin B 
monoclonal antibody - Homo 
sapiens, 339 aa. 
[JP07309900-A, 
28-NOV-1995]


1..339 
1..339 


338/339 (99%) 
339/339 (99%) 


0.0 


AAB53470 


Human colon cancer antigen 
protein sequence SEQ ID 
NO:1010 - Homo sapiens,
344 aa. [WO200055351-A1,
21-SEP-2000] 


1..339 
6..344 


338/339 (99%) 
338/339(99%) 


0.0 


ABP41147 


Human ovarian antigen 
HOFMP73, SEQ ID 
NO:2279 - Homo sapiens, 
346 aa. [WO200200677-A1, 
03-JAN-2002] 


1..339 
8..346 


290/339 (85%) 
317/339 (92%) 


0.0 


ABB06116 


Human NS protein sequence 
SEQ ID NO:208 - Homo 
sapiens, 273 aa. 
[WO200206315-A2, 
24-JAN-2002] 


1..267 
1..267 


266/267 (99%) 
266/267 (99%) 


e-167 


ABB65378 


Drosophila melanogaster 
polypeptide SEQ ID NO 
22926 - Drosophila 
melanogaster, 340 aa. 
[WO200171042-A2, 
27-SEP-2001] 


13..331
13..339 


190/330 (57%) 
232/330 (69%) 


e-113 



In a BLAST search of public sequence databases, the NOV49a protein was found to
have homology to the proteins shown in the BLASTP data in Table 49E.



Table 49E. Public BLASTP Results for NOV49a


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV49a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Portion 


Expect 
Value 






P07858 


Cathepsin B precursor (EC 
3.4.22.1) (Cathepsin B1) (APP
secretase) - Homo sapiens 
(Human), 339 aa. 


1..339
1..339


338/339 (99%)
339/339 (99%)


0.0


KHBOB 


cathepsin B (EC 3.4.22.1) 
precursor - bovine, 335 aa. 


1..335 
1..335 


280/335 (83%) 
307/335 (91%) 


e-180


P07688


Cathepsin B precursor (EC 
3.4.22.1) -Bos taurus 
(Bovine), 335 aa. 


1..335 
1..335 


279/335 (83%) 
307/335 (91%) 


e-180 


P00787 


Cathepsin B precursor (EC 
3.4.22.1) (Cathepsin B1) 
(RSG-2) - Rattus norvegicus 
(Rat), 339 aa. 


1..336 

1 


265/336 (78%) 
2^9/336 (88%) 


e-175 


P10605 


Cathepsin B precursor (EC 
3.4.22.1) (Cathepsin B1) - 
Mus musculus (Mouse), 339 
aa. 


1..336 
1..336 


267/336 (79%) 
297/336 (87%) 


e-174 



PFam analysis predicts that the NOV49a protein contains the domains shown in the 
Table 49F. 



Table 49F. Domain Analysis of NOV49a 


Pfam Domain 


NOV49a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


Peptidase_C1 


80..329 


112/344 (33%) 
218/344 (63%) 


1.3e-117 



Example 50. 

The NOV50 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 50A. 



Table 50A. NOV50 Sequence Analysis 


NOV50a, 


SEQ ID NO: 217 | 960 bp 

CCCGTCCGAGCCCCGGCCCCAAGTA^ 


CG57284-01 
DNA Sequence 


TAAGTGCCTCTTTGCATAGCAC^^ 

^TQGCGGGTCGGGGAGGCGCACGACGACCCAATGGACCAGCTGCTGGGAACAAGATCTGTCMTTT 
AAGCTGGTTCTGCTGGGGGAGTCTGCGGTAGGCAAATCCAGCCTCGTCXITCCGCTTTGTCAAGGGAC 
AGTTTCACGAGTACCAGGAGAGCACAATTGC^GCGGCCTTCCTCACACAGACTGTCTGCCTGGATGA 
CACAACAGTCAAGTTTGAGATCTGGGACACAGCTGGACAGGAGCGGTATCACAGCCTGGCCCCCATG 
TACTATCGGGGGGCCCAGGCTG^CATCGTGGTCTATGACAT 

CCAAGAACTGGGTGAAGGAGCTACAGAGGCAGGCCAGCCCCAACATCGTCATTGCACTCGCGGGTAA 
CAAGGCAGACCTGGCCAGCAAGAGAGCCGTGGAATTC^GG^ 
AGTTTGCTGTTCATGGAGACATCAGCAAAGACTGCJUVTC 

CTAAGAAGCTTCCCAAGAACGAGCCCCAGAATGCAACTGGTGCTCCAGGCCGAAACCGAGGTGTGGA 
CCTCCAGGAGAACAACCCAGCCAGCCGGAGCCAGTGCTGCAGCAACTGAGCCCCCCTTGCCTGCCCG 
CTGCCCCCGCCTCCTCCGCCTGAATGACCCGACTGGAATCCACTCTAACCAATCGCACTTAACGACT 


CGGGCCACCACTGGGGGGGCAGGGGGAGGGGTCCACCATGATTTCTCCATATAATTTTGATCATAGG 


CCGGAGTGAGTCATTCCACCTG • 




ORF Start: ATG at 136 | ORF Stop: TGA at 784 






SEQ ID NO: 218 | 216 aa | MW at 23567.4kD 


NOV50a, 
CG57284-01 
Protein Sequence 


MAGRGGARRPNGPAAGTOICQFKLVLLGESAVGKSSLVLRFVKG 
ITVKFEIWOTAGQERYHSLAPMYYRGAQAAIVVYDITNTDTFARAK^ 
KADLASKRAVEFQEAQAYADDNSIXFMETSAKTAM^^ 
LQENNPASRSQCCSN 





SEQ ID NO: 219 | 747 bp 


NOV50b, 
CG57284-03 
DNA Sequence 



CCACTAAGTGCCTCTTTGCATAGCACCAGTCCCCACCCGCACGCTCTCTGGACCACTACAGCTGGAC 


GGGCAATGGCGGGTCGGGGAGGCGCAGCACGACCCAATGGACCAGCTGCTGGGAACAAGATCTGTCA 
ATTTAAGCTGGTTCTGCTGGGGGAGTCTGCGGTAGGCAAATCCAGCCTCGTCCTCCGCTTTGTCAA^ 
GGACAGTTTCACGAGTACCAGGAGAGCACAATTGGAGCGGCCTTCCTCACACAGACTGTCTGCCTGG 
ATGACAC AACAG TCAAGTTTG AGATC TGGGACAC AG C TGGAC AGG AGCGGT ATC ACAGC CTGGCC C C 
CATGTACTATCGGGGGGCCCAGGCTGCCATCGTGGTCrATGACATCACCAACATCGTCATTGCGCTC 
GCGGGTAACAAGGCAGACCTGGCCAGCAAGAGAGCCGTGGAATTCCAGGAAGCACAAGCCTATGCAG 
ACGACAACAGTTTGCTGTTCATGGAGACATCAGCAAAGACTGCAATGAACGTGAACCAAATCTTCAT 
GGC AATAGCTAAG AAGC TTCCCAAGAACGAGCCCCAGAATGCAACTGGTGC TC C AGGCCGAAACCGA 
GGTGTGGACCTCCAGGAGAACAACCCAGCCAGCCGGAGCCAGTGCTGCAGCAACTGAGCCCCCCTTG 
CCTGCCCGCTGCCCCCGCCTCCTCCGCCTGAATGACCCGACTGGAATCCACTCTAACCAATCGCACT 


TAACGACTCG 




ORF Start: ATG at 73 | ORF Stop: TGA at 658 





SEQ ID NO: 220 | 195 aa | MW at 21039.6kD 


NOV50b, 
CG57284-03 
Protein Sequence 


MAGRGGAARP^PAAGl^ICQFKLVLLGESAVGK S SL VLRF VKGQFHE YQES T IGAAFLTQTVCLDD 
TTVK FE IWDTAG QERYHS r^PMYYRGAQ AA IVVYD I TN I V I ALAGNKADLASKRAVEFQEAQAYADD 
NSLLFMETS AKTAMNVNE I FMAI AKKLPKNE PQNATG A PGRNRGVDLQENNPASRSQCC SN 





SEQ ID NO: 221 | 819 bp 




NOV50c, 
CG57284-02 
DNA Sequence 


AATCGCC TTCCACTAAGTGCC TCTTTGCATAG^ACCAGTCCCCACCCGCACGCTCTCTGGACCACTA 


CAGCTGGACGGGCAATGGCGGGTCGGGGArarrcranr^^ 

GATCTGTCAATTTAAGCTGGTTCTGC TGGGGGAGTCTGCGGTAGGCAAATCCAGCC TCGTCCTCCGC 
TTTGTCAAGGGACAGTTTCACGAGTACCAGGAGAGCACAATTGGAGCGGCCTTCCTCACACAGACTG 
TCTGCC TGGATGAC ACAAC AGTCAAGTTTGAGATC TGGG AC AC AGC TGGACAGGAGCGGTATC AC AG 
CCTGGCCCCCATGTACTATCGGGGGGCCCAGGCTGCCATCGTGGTCTATGACATCACCAACACAGAT 
ACATTTGCACGGGCCAAGAACTGGGTGAAGGAGCTACAGAGGCAGGCCAGCCCCAACATCGTCATTG 
CACTCGCGGGTAACAAGGCAGACCTGGCCAGCAAGAGAGCCGTGGAATTCCAGGAAGCACAAGCCTA 
TGCAGACGACAACAGTTTGCTGTTCATGGAGACATCAGCAAAC^CTGCAATGAACGTGAACGAAATC 
TTCATGGCAATAGCTAAGAAGCTTCCCAAGJ^CGAGCCC^ 

ACCGAGGTGTGGACCTCCAGGAGAACAACCCAGCCAGCCGGAGCCAGTGCTGCAGCAACTGAGCCCC 




ORF Start: ATG at 82 | ORF Stop: TGA at 730 





SEQ ID NO: 222 | 216 aa | MW at 23482.3kD 


NOV50c, 
CG57284-02 
Protein Sequence 


MAGRGGAARPNGPAAGITCICQFfa^VIiLGESAVGKSSLVLR^ 
TTVKFEIWDTAGQERYHSLAPMYYEGAQAAIVVYDITNTDTFAR^ 

KADIASKRAVEFQEAQAYADDNSLLFMETS AKTAMNVNE I FMAIAKKL PKNEPQNATGAPGRNRGVD 
LQENNPASRSQCCSN 
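
The ORF coordinates and protein lengths reported in Table 50A can be cross-checked with a short calculation. The sketch below assumes that the coordinates are 1-based and that the ORF Stop value marks the first base of the stop codon; this reading is an assumption, adopted because it reproduces the reported lengths. The function name `orf_protein_length` is illustrative only.

```python
def orf_protein_length(start: int, stop: int) -> int:
    """Count the codons between the ATG at `start` and the stop codon at `stop`.

    Assumes 1-based coordinates, with `stop` pointing at the first base of
    the stop codon (an assumption, not stated in the tables).
    """
    coding_bases = stop - start
    if coding_bases <= 0 or coding_bases % 3 != 0:
        raise ValueError("ORF coordinates are not in frame")
    return coding_bases // 3


# (ORF start, ORF stop, reported protein length) as listed in Table 50A
reported = {
    "NOV50a": (136, 784, 216),
    "NOV50b": (73, 658, 195),
    "NOV50c": (82, 730, 216),
}

for name, (start, stop, length) in reported.items():
    # Print the computed length next to the reported one for comparison.
    print(name, orf_protein_length(start, stop), length)
```

Under these assumptions the computed codon counts agree with the 216, 195 and 216 residues reported for NOV50a, NOV50b and NOV50c.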



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 50B. 



Table 50B. Comparison of NOV50a against NOV50b and NOV50c. 


Protein Sequence 


NOV50a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 


NOV50b 


18..216 
18..195 


178/199 (89%) 
178/199 (89%) 


NOV50c 


18..216 
18..216 


199/199 (100%) 
199/199(100%) 
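
The percentages in the comparison tables follow from the match counts. A minimal sketch is given below; it assumes the percentage is truncated to a whole number rather than rounded, which is an inference from entries such as 338/339 being reported as 99% in the NOV49a tables, not something the text states. The helper name `percent_identity` is hypothetical.

```python
def percent_identity(matches: int, aligned: int) -> int:
    """Whole-number percentage of matching positions.

    Truncation (floor) is an assumed convention inferred from the tables.
    """
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    return (100 * matches) // aligned


# Values taken from Table 50B and from the NOV49a BLASTP results.
for matches, aligned in [(178, 199), (199, 199), (338, 339)]:
    print(f"{matches}/{aligned} ({percent_identity(matches, aligned)}%)")
```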



Further analysis of the NOV50a protein yielded the following properties shown in 
Table 50C. 



Table 50C. Protein Sequence Properties NOV50a 


PSort analysis: 


0.6500 probability located in cytoplasm; 0.2189 probability located in 
lysosome (lumen); 0.1000 probability located in mitochondrial matrix space; 
0.0000 probability located in endoplasmic reticulum (membrane) 


SignalP analysis: 


No Known Signal Sequence Predicted 



A search of the NOV50a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 50D. 

Table 50D. Geneseq Results for NOV50a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV50a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for 
the Matched 
Region 


Expect 
Value 


AAM79225 


Human protein SEQ ID NO 
too/ - Homo sapiens, 215 aa. 
[WO200157190-A2, 
09-AUG-2001] 


9..216 
8..215 


179/208 (86%) 
194/208 (93%) 


e-101 


AAY56173 


Human Wnt-1 amino acid 
sequence - Homo sapiens, 
215 aa. [CA2200794-A, 
24-SEP-1998] 


9..216 
8..215 


179/208 (86%) 
194/208 (93%) 


e-101 


AAB28187 


Human RAS-related protein 
RAB-5A - Homo sapiens, 
193 aa. [WO200052165-A2, 
08-SEP-2000] 


1..197 
1..192 


178/197 (90%) 
186/197 (94%) 


9e-97 


AAM80209 


Human protein SEQ ID NO 
3855 - Homo sapiens, 255 aa. 
[WO200157190-A2, 
09-AUG-2001] 


9..216 
47..255 


172/209(82%) 
189/209(90%) 


1e-95 


ABB60036 


Drosophila melanogaster 
polypeptide SEQ ID NO 
6900 - Drosophila 
melanogaster, 219 aa. 
[WO200171042-A2, 
27-SEP-2001] 


2..214 
11..218 


159/213 (74%) 
177/213 (82%) 


8e-85 



In a BLAST search of public sequence databases, the NOV50a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 50E. 



Table 50E. Public BLASTP Results for NOV50a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV50a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


P51148 


Ras-related protein Rab-5C 
(RAB5L) (L1880) - Homo 
sapiens (Human), 216 aa. 


1..216 
1..216 


216/216(100%) 
216/216(100%) 


e-122 
AAM21086 


Small GTP binding protein 
RAB5C - Homo sapiens 



1..216 
1..216 



215/216 (99%) 
215/216 (99%) 


e-121 


Q8R1V8 


Hypothetical 23.4 kDa 
protein - Mus musculus 
(Mouse), 216 aa. 


1..216 
1..216 


212/216 (98%) 
213/216 (98%) 


e-119 


P51147 


Ras-related protein Rab-5C - 
Canis familiaris (Dog), 216 
aa. 


1..216 
1..216 


212/216 (98%) 
213/216 (98%) 


e-119 


Q98932 


Rab5C-like protein - Gallus 
gallus (Chicken), 216 aa. 


1..216 
1..216 


203/216(93%) 
208/216 (95%) 


e-114 
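
The Expect values in these tables use BLAST shorthand in which a bare exponent such as e-122 stands for 1e-122. The following sketch, which assumes that shorthand and uses a hypothetical helper `parse_expect`, converts the Table 50E entries to floats and ranks the hits from most to least significant.

```python
def parse_expect(text: str) -> float:
    """Convert a BLAST-style Expect string ('e-122', '9e-97', '0.0') to a float."""
    text = text.strip().lower()
    if text.startswith("e"):
        # A bare exponent is shorthand for a mantissa of 1.
        text = "1" + text
    return float(text)


# Accession -> Expect value, as listed in Table 50E.
hits = {
    "P51148": "e-122",
    "AAM21086": "e-121",
    "Q8R1V8": "e-119",
    "P51147": "e-119",
    "Q98932": "e-114",
}

# Smaller Expect values indicate more significant matches.
ranked = sorted(hits, key=lambda accession: parse_expect(hits[accession]))
print(ranked)
```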



PFam analysis predicts that the NOV50a protein contains the domains shown in the 
Table 50F. 



Table 50F. Domain Analysis of NOV50a 


Pfam Domain 


NOV50a Match Region 


Identities/ 
Similarities 

for the Matched Region 


Expect Value 


arf 


4..185 


40/198 (20%) 
105/198 (53%) 


0.0018 


ras 


23..216 


90/209 (43%) 
181/209(87%) 


3.1e-104 



Example 51. 

The NOV51 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 51A. 



Table 51A. NOV51 Sequence Analysis 




SEQ ID NO: 223 


4826 bp 


NOV51a, 
CG57308-01 
DNA Sequence 


AGCTGAGCCCGAGCCCAGACCGCGCCCGCGCCGCCATGK^CCCTGGCCTTCTGCGGCAGCGAGAAnCIA 


CTCGGCCGCCTACCGGGTGGACCAGGGGGTCCTCAACAACGGCTGCTTTGTGGACGCGCTCAACGTG 
GTGCCGC ACGTC TTCCTACTCTTC ATC ACCTTCCCCATCCTCTTCATTGGATGGGGAAGTCAGAGC T 
CCAAGGTGCACATCCACCACAGCACATGGCTTCATTTCCCTGGGCACAACCTGCGGTGGATCCTGAC 
CTTCATGCTGCTCTTCGTCCTGGTGTGTGAGATTGCAGAGGGCATCCTGTCTGATGGGGTGACCGAA 
TCCCACCATCTGCACCTGTACATGCCAGCCGGGATGGCGTTCATGGCTGCTGTCACCTCCGTGGTCT 
ACTATC ACAAC ATCGAGAC TTCCAACTTCCCCAAGC TGCTAATTGC CCTGCTGGTGTATTGGACCCT 
GGCCTTCATCACCAAGACCATCAAGTTTGTCAAGTTCTTGGACCACGCCATCGGCTTCTCGCAGCTA 
CGCT TCTGCCTC AC AGGGC TGCTGGTGATCCTCTATGGGATGCTGC TCCTCG TGGAG GTCAATGTCA 
TCAGGGTGAGGAGATACATC T TCTTCAAGACACCGAGGG AGGTGAAGCC TCCCGAGGACCTGCAAGA 
CCTGGGGGTACGCTTCCTGCAGCCCTTCGTGAATCTGCTGTCCAAAGGCACCTACTGGTGGATGT^AC 
GCCTTCATCAAGACTGCCCACAAGAAGCCCATCGACTTGCGAGCCATCGGGAAGCTGCCCATCGCCA 
TGAGGGCCCTCACCAACTACCAACGGCTCTGCGAGGCCTTTGACGCCCAGGTGCGGAAGGACATTCA 
GGGCACTCAJ^TKCCG^ 

AGCAGCACTTTCCGCATCTTGGCCGACCTGCTGGGCTTCGCCGGGCCACTGTGCATCTTTGGGATCG 
TGGACCACCTTGGGAAGGAGAACGACGTCTTCCAGCCCAAGACACAATTTCTCGGGGTTTACTTTGT 
CTCATCCCAAGAGTTCCTTGCCAATGCCTACGTCTTAGCTGTGCTTCTGTTCCTTGCCCTCCTACTG 
CAAAGGACATTTCTGCAAGCATCCTACTATGTGGCCATTGAAACTGGAATTAACTTGAGAGGAGCAA 
TACAGACCAAGATTTACAATAAAATTATGCACCTGTCCACCTCCAACCTGTCCATGGGAGAAATGAC 
TGCTGGACAGATCTGTAATCTGGTTGCCATCGACACCAATCAGCTCAT^ 

CCAAACCTCTGGGCTATGCCAGTACAGATCATTGTGGGTGTGATTCTCCTCTACTACATACTCGGAG 
TC AGTGCCTTAATTGGAGC AGCTGTC ATCATTC TACTGGCTCCTG TCCAGTAC TTCGTGGCC ACCAA 
GCTGTCTCAGGCCCAGCGGAGCACACTGGAGTATTCCAATGAGCGGCTGAAGCAG^CCAACGAGATC 
C TCCGCGGCATCAAGC TGCTG AAGC TGTACGCCTGGGAGAACATC TTCCGCACG CGGGTGGAGACGA 
CCCGCAGGAAGGAGATGACCAGCCTCAGGGCCTTTGCCATCTATACCTCCATCTCCATTTTCATGAA 
CACGGCCATCCCCATTGCAGCTGTCCTCATAACTTTCGTGGGCCATGTCAGCTTCTTCAAAGAGGCC 
GACTTCTCGCCCTCCGTGGCCTTTGCCTCCCTCTCCCTCTTCCATATCTTGGTCACACC^CTGTTCG 
TGCTGTCCAGTGTGGTCCGATCTACCGTGAAAGCTCTAGTGAGCGTGCAAAAGCTAAGCGAGTTCCT 
GTCCAGTGCAGAGATCCGTGAGGAGCAGTGTGCCCCCCATGAGCCCACACCTCAGGGCCCAGCCAGC 
AAGTACCAGGCGGTGCCCCTCAGGGTTGTGAACCGCAAGCGTCCAGCCCGGGAGGATTGTCGGGGCC 
TC ACCGGCCCACTGC AGAGCCTGGTCCCC AGTGCAGATGGCG ATGC TGAC AAC TGC TGTGTC C AGAT 

CATGGGAGGCTACTTCACGTGGACCCCAGATGGAATCCCCACACTGTCCAACATCACCATTCGTATC 
CCCCGAGGCCAGCTG AC TATGATCGTGGGGC AGGTGGGC TGCGGCAAGTCC TCGCTCC TTCTAGCCG 

CACTGGGGGAGATCCAGAAGGTCTC^GGGGCTGTCTTCTGGAGCAGCCTTCCTGACAGCGAGATAGG 
AGAGGACCCCAGCCCAGAGCGGGAGACAGCGACCGACTTGGATATCAGGAAGAGAGGCCCCGTGGCC 
TATGCTTCGC AGAAACCATGGC TGCTAAATGCC ACTGTGGAGG AGAAC ATCATCTTTGAGAGTCCC T 
TCAACAAACAACGGTACAAGATGGTCATTGAAGCCTGCTCTCTGCAGCCAGACATCGACATCCTGCC 
CCATGGAGACCAGACCCAGATTGGGGAACGGGGCATCAACCTGTCTGGTGGTCAACGCCAGCGAATC 
AGTGTGGCCCGAGCCCTC TACCAGCACGCCAACGTTGTCTTC TTGGATGACCCC TTCTCAGCTC TGG 

ATATCCATCTGAGTGACC^CTOAATGCAGGCCGGCATCCTTGAGCTGCTCCGGGACGACAAGAGGAC 
AGTGGTCTTAGTGACCCACAAGCTACAGTACCTGCCCCATGCAGACTGGATCATTGCCATGAAGGAT 
GGCACCATCCAGAGGGAGGGTACCCTCAAGGACTTCCAGAGGTCTGAATGCCAGCTCTTrcAGCACT 
GGAAGACCCTCATGAACCGACAGGACCAAGAGCTGGAGAAGGAGACTGTCACAGAGAGAAAAGCCAC 
AGAGCCACCCCAGGGCCTATCTCGTGCCATGTCCTCGAGGGATGGCCTTCTGCAGGATGAGGAAGAG 
GAGGAAGAGGAGGCAGCTGAGAGCGAGGAGGATGACAACCTGTCGTCCATGCTGCACCAGCGTGCTG 
AGATCCCATGGCGAGCCTGCGCCAAGTACCTGTCCTCCGCCGGCATCCTGCTCCTGTCGTTGCTGGT 
CTTCTC ACAGC TGCTCAAGCACATGGTCCTGGTGGCCATCGACTAC TGGC TGGCCAAGTGGACCGA( 

CTGTCTATGCCATGGTCTTCACGGTGCTCTGCAGCCTGGGCATTGTGCTGTGCCTCGTCA 
CACTGTGGAGTGGACAGGGCTGAAGGTGGCCAAGAGACTGCACCGCAGCCTGCTAAAC^ 
CTAGCCCCCATGAGGTTTTTTGAGACCACGCCCCTTGGGAGCATCCTGAACAGATTTTCATCTGACT 
GTAACACC^TCGACCAGGACA!TCCCATCCACX3CTGGAGTGCCTGAGCCGCTCCACCCTC 
CTCAGCCCTGGCCGTCATCTCCTATGTCACACCTGTGTTCCTCGTGGCCCTCTTGCCCCTGGCCATC 
GTGTGCTACTTCATCCAGAAGTACTTCCGGGTGGCGTCCAGGGACCTGCAGCAGCT^ 
CCCAGCTTCCACTTCTCTCACACTTTGCCGAAACCGTAGAAGGACTCACCACCATCCGGGCCTTCAG 
GTATGAG GCCCGGTTCC AGCAGAAGCT TCTCGAATACAC AGAC TC CAACAACAT TGC TTC CCTC T TC 
CTCACAGCTGCCAACAGATGGCTGGAAGTCCGAATGGAGTACATCGGTGCATGTGTGGTGCTCATCG 
CAGCGGTG ACCTCCATC TCC AACTCCC TGCACAGGGAGCTCTC TGCTGGCCTGGTGGGCC TGGGCC T 
TACCTACGCCC TAATGGTCTCCAACTACCTCAACTGGATGGTGAGGAACC TGGCAGAC ATGGAGCTC 
CAGCTGGGGGCTGTGAAGCGCATCCATGGGCTCCTGAAAACCGAGGCAGAGAGCTACGAGGGGCTCC 
TGGCACCATCGCTGATCCCAAAGAACTGGCCAGACCAAGGGAAGATCCAGATCCAGAACCTCAGCGT 
GCGCTACGACA.GCTCCCTGAAGCCGGTGCTGAAGCACGTCAATGCCCTCATCTCCCCTGGACAGAAG 
ATCGGGATCTGCGGCCGCACCGGCAGTGGGAAGTCCTCCTTCTCTCTTGCCT^ 
ACACGTTCGAAGGGCACATCATCATTGATGGCATTGACATCGCCAAACTGCCGCTGCACACCCTGCG 
C TCACGCCTCTCCATCATCCTGCAGGACCCCGTCC TC TTCAGCGGC ACCATC CGATTTAACCTGGAC 
CCTGAGAGGAAGTX3CTCAGATAGCACACTGTGGGAGGCCCTGGAAATCGCCCAGCTGAAGCTGGTGG 
TGAAGGCACTGCCAGGAGGCCTCGATGCC^TCATCACAGAAGGCGGGGAGAATTTCAGCCAGGGACA 
GAGGCAGCTGTTCTGCCTGGCCCGGGCCTTCGTGAGGAAGACCAGCATCTTCATCATGGACGAGGCC 
ACGGCTTCCATTGACATGGCCACXSGAAAACATCCTCCAAAAGGTGGTGATGACAGCCTTCGCAGAC^ 
GCAC TGTGGTCACCATCGCGCATCGAGTGCAC ACCATCC TGAGTGCAGACC TGGTGATCGTCCTGAA 
GCGGGGTGCCATCCTTGAGTTCGATAAGCCAGAGAAGCTGCTCAGCCGGAAGGACAGCGTCTTCGCC 

TCCTTCGTCCGTGCAGACAAGTGA CCTOCCAGAGCCCAA^CCATCCCACATTCC-GACCCTGGC rA 
TA 



ORF Start: ATG at 36 



ORF Stop: TGA at 4779 





SEQ ID NO: 224 | 1581 aa | MW at 177005.9kD 


NOV51a, 
CG57308-01 
Protein Sequence 


MPL AFCGSEI^S AAYRVDQGVLNNGCFVDALNVVPHVFLL F I TF P ILF IGWG S QSSKVHIHHS TWLH 
FPGHNIiRWI LTFMLLFVLVC E I AEGILSDGVTE SHHLHL YMPAGMAFMAAVT S WYYHNI ET SNFPK 
LLIALLVYWTLAFITKTIKFVKFLDHAIGFSQLRFCLTGLLVILYGMLLLVEV^ 
REVKPPEDLQDLGVRFLQPFVl^LSKGTYW^ 

AFDAOVRKDIOGTOGARAIWO ALS HAFGRRLVLS STFR ILADLLGFAGPLC IFG I VDHLGKENDVFO 

PKTQFLGVYFVSSQEFLANAYVL^^ 

ST SNLSMGEMTAGQI CNLVA I DTNQU4WFFFLCPNLWAMPVQI I VGVTLL YYI LG VSALIGAAVT IL 
LAPVQYFVATKLSQAQRSTLEYSNERLKQTNEMLRGIK^ 

AIYTS I S IFMNTAI PIAAVL ITFVGHVSFFKEADFSPSVAFASIiSLFHILVTPLFLLSSVVRSTVKA 
LVSVQKLSEFLSSAfiraEEQCAPHEPTPQGPASKYQAVPIiRV^^ 

dgdadnccvqimggyftwtpdgiptlsnitiriprgqltmivgqvgcgksslllaalgemqkvsgav 

FWSSLPDSEIGEDPSPERETATDL0IRKRGPVAYASQKPWLLNATVBEOTIFESPFNKQRYKMVIEA 
CSLQ PDIDILPHGDQTQI GERGINL SGGQRQRI SVARAL YQHANWFLDDPFSALDIHL SDHLMQAG 
ILELLRDDKRTWIA/THKLQYLPHAD^^ 

EKETVTERKATEPPQGLSRAMSSR1X5LLQDEEEEEEEAAESEEDDNLSSMLHQRAEIPWRAC7UCYLS 

SAGILLLSLLVFSQLLKHimiVAIDYV^AKOTDSALTLTPAARNCSLSQECTLDQTVYAMVFT^ 

LGIVLCLVTSVTVEWTGLKVAKRLHRSLL^ 

ECLSR STLLCVS ALAVI S YVT PVFLVALLPLAI VC YF IQKYFRVASRDLQQLDDTTQLPLLSHFAET 
VEGLTTIRAFRYEARFQQKLLEYTDSIWIASLFLTAANRV^EVRMEYIGACVVLIAAVTSISNSLHR 
ELSAGLVGI^LTYALWSNYLNWMVRmJ^^ 

QGKIQIQNLSVRYDSSLKPVLKHVNALISPGQKIGICGRTGSGKSSFSLAFFRMVDTFEGHIIIDGI 
DIAKLPI^TI^SRLSIILQDPVLFSGTIRFNIJ^PERKCSDSTLWEALEIAQLKLVVKALPG^ 
TEGGENF SQGQRQLFCLARAFVRKTS IFIMDEATAS IDMATEN ILQKWMTAFADRTWTIAHRVHT 
ILSADLVi\^KRGAILEFDKPEKLLSRKDSVFASFVRADK 





SEQ ID NO: 225 | 4745 bp 


NOV51b, 
CG57308-02 
DNA Sequence 


CGGGGCCCGGGGGGCGGGGGCCTGACGGCCGGGCCGGGCGGCGGAGCTGCAAGGGACAGAGGCGCGG 


CAGGCGCGCGGAGCCAGCGGAGCCAGCTGAGCCCGAGCCCAGCCCGCGCCCGCGCCGCCATGCCCCT 


GGCCTTCTGCGGCAGCGAGAACCACTCGGCCGCCTACCGGGTGGACCAGGGGGTCCTCAACAACGGC 
TGCTTTGTGGACGCGCTCAACGTGGTGCCGCACGTCTTCCTACTCTTCATCACCTTCCCCATCCTCT 

tcattggatggggaagox:agagctccaaggtgcacatccaccacagcacatggcttcatttccccgg 

GC AC AACC TGCGGTGGATCCTGACC TTC ATGCTGC TCTTCGTCCTGGTGTGTGAGATTGCAGAGGGC 
ATCCTGTCTGATGGGGTGACCGAATCCCACCATCTGCACCTGTACATGCCAGCCGGGATGGCGTTCA 
TGGC TGCTGTC ACC TCCGTGGTCTACTATCACAACATCGAGACT TCCAACTTCCCC AAGCTGCTAAT 
TGCCCTGCTGGTGTATTGGACCCTGGCCTTCATGACCIAAGACCA 

CACGCCATCGGCTTCTCX3CAGCTACGCTTCTGCCTCACAGGGCTGCTGGTGATCCTCTATGGGATGC 

TGCTCCTCGTCGAGGTCAATGTCATCAGGGTGAGGAGATACATCTTCTTCAAGACACCGAGGGAGGT 

GAAGCCTCCCGAGGACCTGC AAGACCTGGGGGTACGCTTCCTGCAGCC C TTCGTGAATCTGCCGTCC 

AAAGGCACC TAC TGGTGGATGAACGC CTTCATCAAGAC TGCCCAC AAGAAGC CC ATCGACTTGCGAG 

CCATCGGGAAGCTGCCCATCGTTATGAGGGCCCTCACCAACTACCAACGGCTCTGCGAGGCCTTTGA 

CGCCCAGGTGCGGAAGGACATTCAGGGCACTCAAGGTGCCCGGGCCATCTGGCAGGCACTCAGCCAT 

GCCTTCGGGAGGCGCCTGGTCCTCAGCAGCACTTTCCGCATCTTGGCCGACCTGCTGGGCTTCGCCG 

GGCCACTGTGCATCTTTGGGATCGTGGACCACCTTGGGAAGGAGAACGACGTCTTCCAGCCCAAGAC 

ACAATTTCTCGGGGTTTACTTTGTCTCATCCCAAGAGTTCCTTGCCAATGCCTACGTCTTAGCTGTG 

CTTCTGTTCCTTGCCCTCCTACTGCAAAGGAC^TTTCTGCAAGCATCCTACTATGTG 

CTGGAATTAACTTGAGAGGAGCAATACAGACCAAGATTTACAATAAAATTATGCACCTGTCCACCTC 

CAACCTGTCCATGGGAGAAATGACTGCTGGACAGATCTGTAATCTGGTTGCCATCGACACCAATCAG 

CTCATGTGGTTTTTCTTCTTGTGCCCAAACCTCTGGGCTATGCCAGTACAGATCATTGTGGGTGTG^ 

TTCTCCTCTACTACATACTCGGAGTCAGTGCCTTAATTGGAGCAGCTGTCATCATTCTACTGGCTCC 

TGTCCAGTACTTCGTGGCCACCAAGCTGTCTCAGGCCCAGCGGAGCACACTGGAGTATTCCAATGAG 

CGGCTGAAGCAGACCAACGAGATGCTCCGCGGCATCAAGCTGCTGAAGCTGTACGCCTGGGAGAACA 

TCTTCCGCACGCGGGrGGAGACGACCCGCAGGAAGGAGATGACCAGCCTCAGGGCCTTTGCCATCTA 

TACCTCCATCTCCATTTTCATGAACACGGCCATCCCCATTGCAGCTGTCCTCATAACTTTCGTGGGC 

CATGTCAGCTTCTTCAAAGAGGCCGACTTCTCGCCCTCCGTGGCCTTTGCCTCCCTCTCCCTCTTCC 

ATATC TTGGTC AC ACCGCTGTT CC TGCTGTCC AGTGTGGTCCGATC TACCGTCAAAGCTCTAGTGAG 

CGTGCAAAAGCTAAGCGAGTTCCTGTCCAGTGCAGAGATCCGTGAGGAGCAGTGTGCCCCCCATGAG 

CCCACACCTCAGGGCCCAGCCAGCAAGTACCAGGCGGTGCCCCTCAGGGTTGTGAACCGCAAGCGTC 

C AGCCCGGGAGGATTGTCGGGGCC TGACCGGCCCACTGC AGAGC CTGGTCCCCAGTGCAGATGGCGA 

TGCTGACAACTGCTGTGTCCAGATCATGGGAGGCTACTTCACGTGGACCCCAGATGGAATCCCCACA 

CTGTCCAAC^TCACCATTCGTATCCCCCGAGGCCAGCTGACTATGATCGTGOTGCAGGTGGGCTGCG 

GC^AAGTCCTCGCTCCTTCTAGCCGCACTGGGGGAGATGCAGAAGGTCTCAGGGGCTGTCTTCTGGAG 

CAGCC TTCCTGACAGCG AGATAGGAGAGGACCCCAGCCCAGAGCGGGAGACAGCGACCGAC T TGGAT 

ATCAGGAAGAGAGGCCCCGTGGCCTATGCTTCGCAGAAACCATGGCTGCTAAATGCCACTGTGGAGG 

AGAACATCATCTTTGAGAGTCCCTTCAACAAACAACGGTACAAGATGGTCATTGAAGCCTGCTCTCT 

GCAGCCAGACATCGACATCCTGCCCCATGGAGACCAGACCCAGATTGGGGAACGGGGCATCAACCTG 

TCTGGTGGTCAACGC CAGCGAATC AGTGTGGCCCGAGCCC TCTACC AGC ACGCC AACGTTGTC TTC T 

TGGATGACCCCTTC TC AGCTC TGGATATC C ATCTGAGTGACCACTTAATGCAGGCCGGC ATC CTTGA 

3CTGCTCCGGGACGACAAGAGGACAGTGGTCTTAGTGACCCACAAGCTACAGTACCTGCCCCATGCA 

3ACTGGATCATTGCCATGAAGGATGGCACCATCCAGAGGGAGGGTACCCTCAAGGACTTCCAGAGGT 

ZTGAATGCCAGCTCTTTGAGCACTGGAAGACCCTCIATGAACCGACAGGACCAAGAGCTGGAGAAGGA 

3ACTGTCACAGAGAGAAAAGCCACAGAGCCACCCCAGGGCCTATCTCGTGCCATGTCCTCGAGGGAT 

3GCCTTCTGCAGGATGAGGAAGAGGAGGAAGAGGAGGCAGCTGAGAGCGAGGAGGATGACAACCTGT 

-GTCXIATGCTGCACCAGCGTGCTGAGATCCCATGGCGAGCCTGCGSC^ 

:ATCCTGCTCCTGTCGTTGCTGGTCTTCTCACAGCTGCTCAAGCACATGGTCCTGGTGGCCATCGAC 

TACTGGCTGGCCAAGTGGACCGACAGCGCCCTGACCCTGA^ 

GCCAGGAGTGCACCCTCGACCAGACTGTCTATGCCATGGTGTTCAC GGTGCTC TGCAGCC TGGGCAT 
TGTGCTGTGCCTCGTCACGTCTGTCACTGTGGAGTGGACAGGGCTGAAGGTGGCCAAGAGACTGCAC 
CGCAGCCTGCTAAACCGGATCATCCTAGCCCCCATGAGGTTTTT'I'GAGACCACGCCCCTTGGGAGCA 
TCCTGAACAGATTTTCATCTGACTGTAACACCATCGACCAGCACATCCCATCCACGCTGGAGTGCCT 
GAGCCGCTCCACCCTGCTCTGTGTCTCAGCCCTGGCCGTCATCTCCTATGTCACACCTGTGTTCCTC 
GTGGCCCTCTTGCCCCTGGCCATCGTGTGCTACTTC^TCCAGAAGTACTTCCGGGTGGCGTCCAGGG 
AC CTGCAGCAG CTGGATGAC ACC ACCC AGCTTCC ACTTCTCTC AC AC TTTGCCGAAACCGTAGAAGG 
ACTC^CCACCATCCGGGCCTTCAGGTATGAGGCCCGGTTCCAGCAGAAGCTTCTCGAATACACAGAC 
TCCAACMCATTGCTTCCCTCTTCCTCACAGCTGCCAACAGATGGCTGGAAGTCCGAATGGAGTACA 
TCGGTGCATGTGTGGTGC TCATCGCAGCGGTGACCTCCATCTCCAAC TCCCTGC AC AGGGAGCTC TC 

>, * awvjuvv- a J rtu v. Jl AAi uu 1 U i\, WiAU 1 Al-V-TwiALTGGATGGTG 

AGGAACCTGGCAGACATGGAGCTCCAGCTGGGGGCTGTGAAGCGCATCCATGGGCTCCTGAAAACCG 
AGGCAGAGAGCTACGAGGGGCTCCTGGCACCATCGCTGATCCCAAAGAACTGGCCAGACCAAGGGAA 
GATCCAGATCC AGAACC TG AGCGTGCGCTACG ACAGCTCCCTGAAGCCGGTGC TGAAGCACGTC AAT 
GCCCTCATC TCCCC TGG AC AG AAGATCGGGATCTGCGGCCGCACCGGCAG TGGGAAGTCC TC CTTCT 
CTCTTGCCTTCTTCCGCATGGTGGACACGTTCGAAGGGCACATCATCACAGAAGGCGGGGAGAATTT 
CAGCCAGGGACAGAGGCAGCTGTTCTGCCTGGCCCGGGCCTTCGTGAGGAAGACCAGCATCTTCATC 
ATGa^AGGCCACG^TTCCATTGACATGGCCACGGAAAACATCCTCCAAAAGGTGGTGATGACAG 
CCTTCGCAGACCGCACK5TC^TCACCATCGCGCATCG&GTGCA^^ 

GATCGTCCTGAAGCGGGGTGCCATCCTTGAGTTCGATAAGCCAGAGAAGCTGCTCAGCCGGAAGGAC 
AGCGTCTTCGCCTCCTTCGTCCGTGCAGACAAGTGACCTGCCAGAGCCCAAGTGCCATrrrara«TTn 
GGACCCTGCCCATACCCCTGCCTGGGTTTTCTA ArTOTA A awirnw-na a a™ a 


ORF Start: ATG at 127 | ORF Stop: TGA at 4657 





SEQ ID NO: 226 | 1510 aa | MW at 169179.9kD 


NOV51b, 
CG57308-02 
Protein Sequence 


MPLAFCGSEiraSAAYRVDQGVLNN^ 
FPGHNLRWJXTFMLLFVLVCEIAEGILSDGWTESH^ 

LL I ALLVYWTLAF ITKTI KF VKLLDHAT. GFSQLRFCIjTGLLVILYGMLLLVEVOTIRVRRYI FFKTP 
REVKPPEDLQDLGVRFLQPFVNLPSKGTYWWMNA^ 

AFDAQVRKDI QGTQGARAI WQALSHAFGRRLVLSS TFR IL AI>LLGFAGPIiC IFG I VDHLGKENDVFQ 
PKTQFLGVYFVS SQEFLA1JAYVLAVLLFLALLLQRTFLQAS YYVAIETG INLRGAIQTKIYNKIMHL 
STSNLSMGEMTAGQICNLVAIDTNQLMWFFFLCPl^WAMPVQI IVGVILLYYILGVSALIGAAVI XL 
LAPVQYFVATKL SQAQRS TLEYSNERLKQTNEMLRG IKLLKL YAWENI FRTRVETTRRKEMTSLRAF 
AIYTSISIFI^TAIPIAAVIjITFVGHVSFFKEADFSPSVAF 

LVS VQKLSEFLS S AE IREEQC APHEPT PQGPASKYQAVPLRWNRKRPAREDCRGLTGPLQSLVPSA 
DG DADNCCVQ IMGG YF TWT PDGI PTLSNITIRI PRGQLTMIVGQVGCGKSSLLLAALGEMQKVSGAV 
FWSSLPDSEIGEDPSPERETATDLDIRKRGFVAYASQKPTO^^ 
C SLQPDIDILPHGDQTQIGERGIl^ SGGQRQR ISVARALYQHA^ 

I LELLRDDKRTVVL VTHKLQ YL PHADWI I AMKDGT I QREGTLKDFQRS ECQL FEHWKTLMNRQDQEL 

EKETVTERKATEPPQ/3LSRAMSSRDGLLQDEEEEEEEAAESEEDDNLSSMLHQRAEIPWRACAKYLS 
SAGILI*LSLLVFSQLLKHMVI,VAIDYWIJ^TD^ 

LGIVLCLVTSVTVEWTGI^AKJ^HRSLLNRIILAPMRFFETTPLGSIL 

ECLSRSTLLCVSALAVISYVTPWLVALLPLAIVCYFIQKYFRVASRDLQQLDDTTQLPLLSHFAET 
VEGL T T I RAFR YEARFQQKL L B YTDSNNI ASL FL TAANRWL EVRME YI GACWL IAAVT S I SNS LHR 
ELSAGLVGLGLTYALMVSNYL>WMVRNIjA^ i pknwpd 

QGKIQIQNLSVRYDSSLKPVLKHVNALISPGQKIGXCGRTGSGKSSFSLAFFRMVDTFEGHIITEGG 

ENFSQGQRQLFCLARAFVRKTSIFIMDEATASIDilATENILQKVVMTAFM 

DLVI VLKRGAILEFDK PEKLL SRKDS VFAS FVRADK 
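
As a plausibility check on the molecular weights listed in Table 51A, the reported figure can be divided by the reported residue count. The sketch below does only that; the helper `mean_residue_mass` is hypothetical, and the comparison against the commonly quoted average of roughly 110 Da per residue is a rule of thumb rather than anything stated in the tables. On that basis the printed MW figures appear to read most naturally as daltons.

```python
def mean_residue_mass(molecular_weight: float, residues: int) -> float:
    """Average mass per residue, in the same unit as `molecular_weight`."""
    if residues <= 0:
        raise ValueError("residue count must be positive")
    return molecular_weight / residues


# (reported MW figure, reported length in residues) from Table 51A
entries = {
    "NOV51a": (177005.9, 1581),
    "NOV51b": (169179.9, 1510),
}

for name, (mw, length) in entries.items():
    print(f"{name}: {mean_residue_mass(mw, length):.1f} per residue")
```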



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 51B. 




Table 51B. Comparison of NOV51a against NOV51b. 


Protein Sequence 


NOV51a Residues/ 
Match Residues 


Identities/ 

Similarities for the Matched Region 
NOV51b 



1..1406 
1..1406 



1285/1406 (91%) 
1286/1406 (91%) 






Further analysis of the NOV51a protein yielded the following properties shown in 
Table 51C. 



Table 51C. Protein Sequence Properties NOV51a 


PSort analysis: 


0.8000 probability located in plasma membrane; 0.4000 probability located in 
Golgi body; 0.3000 probability located in endoplasmic reticulum (membrane); 
0.3000 probability located in microbody (peroxisome) 


SignalP analysis: 


Cleavage site between residues 56 and 57 



A search of the NOV51a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 51D. 



Table 51D. Geneseq Results for NOV51a 


Geneseq 
Identifier 


Protein/Organism/Length 
[Patent #, Date] 


NOV51a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Region 


Expect 
Value 


AAW57412 


Homo sapiens sulphonylurea 
receptor - Homo sapiens, 
1580 aa. [WO9814571-A1, 
09-APR-1998] 


1..1581 
1..1580 


1530/1582(96%) 
1540/1582(96%) 


0.0 


AAR77087 


Rat sulphonylurea receptor - 
Rattus sp, 1582 aa. 
[WO9528411-A1, 
26-OCT-1995] 


1..1581 
1..1582 


1477/1582(93%) 
1509/1582(95%) 


0.0 


AAR77088 


Hamster sulphonylurea 
receptor - Cricetus sp, 1582 
aa. [WO9528411-A1, 
26-OCT-1995] 


1..1581 
1..1582 


1469/1582(92%) 
1506/1582(94%) 


0.0 


AAR77084 


Rat sulphonylurea receptor - 
Rattus sp, 1498 aa. 
[WO9528411-A1, 
26-OCT-1995] 


1..1290 
1..1291 


1195/1291 (92%) 
1223/1291 (94%) 



0.0 
AAR77085 



Hamster sulphonylurea 
receptor - Cricetus sp, 1498 
aa. [WO9528411-A1, 
26-OCT-1995] 



1..1290 
1..1291 



1186/1291 (91%) 
1220/1291 (93%) 


0.0 



In a BLAST search of public sequence databases, the NOV51a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 51E. 



Table 51E. Public BLASTP Results for NOV51a 


Protein 

Accession 

Number 


Protein/Organism/Length 


NOV51a 
Residues/ 
Match 
Residues 


Identities/ 
Similarities for the 
Matched Portion 


Expect 
Value 


Q09428 


Sulfonylurea receptor 1 - 
Homo sapiens (Human), 1580 
aa. 


2..1581 
1..1580 


1579/1580 (99%) 
1579/1580 (99%) 


0.0 


Q09429 


Sulfonylurea receptor 1 - 
Rattus norvegicus (Rat), 1581 
aa. 


2..1581 
1..1581 


1512/1582 (95%) 
1536/1582 (96%) 


0.0 


Q09427 


Sulfonylurea receptor 1 - 
Cricetus cricetus 
(Black-bellied hamster), 1581 
aa. 


2..1581 
1..1581 


1498/1582 (94%) 
1530/1582 (96%) 


0.0 


A56248 


sulfonylurea receptor - golden 
hamster, 1582 aa. 


1..1581 
1..1582 


1469/1582(92%) 
1506/1582 (94%) 


0.0 


Q95J92 


Sulphonylurea receptor 2B - 
Oryctolagus cuniculus 
(Rabbit), 1549 aa. 


1..1580 
1..1548 


1076/1581 (68%) 
1277/1581 (80%) 


0.0 



PFam analysis predicts that the NOV51a protein contains the domains shown in the 
Table 51F. 



Table 51F. Domain Analysis of NOV51a 


Pfam Domain 


NOV51a Match Region 


Identities/ 
Similarities 
for the Matched 
Region 


Expect Value 


ABC_membrane 


318..590 


53/287 (18%) 
212/287 (74%) 


3.6e^6 
ABC_tran 


706..905 


55/214 (26%) 
154/214 (72%) 


1.3e-34 


ABC_membrane 


1011..1298 


58/292(20%) 
222/292 (76%) 


2.7e-51 




PRK 


1374..1391 


6/19 (32%) 
15/19 (79%) 


0.21 




ABC_tran 


1371..1554 


54/199 (27%) 
129/199 (65%) 


5.7e-36 
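
The match regions in Table 51F can be turned into domain spans and checked for overlap. The sketch below is illustrative only: the `DomainHit` type and `overlaps` helper are hypothetical, the coordinates are taken as inclusive 1-based residue positions (an assumption), and the domain names are those read from the table.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        # Inclusive coordinates, so both end points count.
        return self.end - self.start + 1


def overlaps(a: DomainHit, b: DomainHit) -> bool:
    """True when the two inclusive residue ranges share at least one position."""
    return a.start <= b.end and b.start <= a.end


# Match regions as listed in Table 51F.
hits = [
    DomainHit("ABC_membrane", 318, 590),
    DomainHit("ABC_tran", 706, 905),
    DomainHit("ABC_membrane", 1011, 1298),
    DomainHit("PRK", 1374, 1391),
    DomainHit("ABC_tran", 1371, 1554),
]

overlapping = [
    (a.name, b.name)
    for i, a in enumerate(hits)
    for b in hits[i + 1:]
    if overlaps(a, b)
]

for hit in hits:
    print(hit.name, hit.length)
print(overlapping)
```

With these inputs the only overlapping pair is the short PRK match and the second ABC_tran region, which contains it.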





Example 52. 

The NOV52 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 52A. 



Table 52A. NOV52 Sequence Analysis 




SEQ ID NO: 227 | 1404 bp 


NOV52a, 
CG93659-01 
DNA Sequence 


ATQGAGTACATGAGCACTGGAAGTGACAATAAAGAAGAGATTGATTTATTAATTAAACATTTAAATG 
TGTCTGATGTAATAGACAl^ATGGAAAATCTTTATGCAAGTGAAGAGCCAGCAGTTTATGAACCCAG 

tctaatgaccatgtgtcaagacagtaatcaaaacgatgagcgttctaagox:tctgctgcttagtggc 
caagaggtaccatggttgtcatcagtcagatatggaactgtggaggatttgcttgcttttgcaaacc 
atatatccaacactgcaaagcatttttatggacaacgaccacaggaatctggaattttattaaacat 
ggtg&tcactccccaaaatgg&cgttaccaaatagattc^ 

acttacaggaatattggttctgattttattcctcggggcgcctttggaaaggtatacttggctcaag 
atataaagacgaagaaaagaatggcgtgtaaactgatcccagtagatcaatttaagccatctgatgt 
ggaaattcaggcttgcttccggcacgagaacatcgcagagctgtatggcgcagtcctgtggggtgaa 
actgtccatctctttatggaagcaggcgagggagggtctgttctggagaaactggagagctgtggac 
caatgagagaatttcaaattatttgggtgacaaagcatgttctcaagggacttgattttctacactc 
aaagaaagtgatccatcatgatattaaacctagcaacattgttttcatgtccacaaaagctgttttg 

GTGGATTTTGGCCTAAGTGTTCAAATGACCGAAGATGTCTATTTTCCTAAGGACCTCCGAGGAACAG 

AGATTTACATGAGCCCAGAGGTCATCGTGTCCAGGGGCCATTCAACCAAAGCAGACATCTACAGCCT 

GGGGGCCACGCTCATCCACATGCAGACGGGCACCCCACCCTGGGTGAAGCGCTACCCTCGCTCAGCC 

TATCCCTCCTACCTGTACATAATCCACAAGCAAGCACCTCCACTGGAAGACATTGCA^ 

GTCCAGGGATGAGAGAGCTGATAGAAGCTTCCCTGGAGAGAAACCCCAATCACCGCCCAAGAGCCGC 

AGACCTACTAAAACATGAGGCCCTGAACCCGCCCAGAGAGGATCAGCCACGCTGTACGAGTCTGGAC 

TCTGC CC TCTTGGAGCGCAAGAGGC TGCTGAGTAGGAAGGAGC TGGAAC TTCC TGAGAACATTGCTG 

ATTCTTCGTGCACAGGAAGCACCGAGGAATCTGAGATGCTCAAGAGGCAACGCTCTCTCTACATCGA 

CCTCGGCGCTCTGGCTGGCTACTTCAATCTTGTTCGGGGACCACCAACX3CTTGAATATGGCTGA 




ORF Start: ATG at 1 | ORF Stop: TGA at 1402 





SEQ ID NO: 228 | 467 aa | MW at 52896.9kD 


NOV52a, 
CG93659-01 
Protein Sequence 


IffiYMSTGSDX^EIDLLIIOaNVSDVr^^ 

QEVPWLSSVRYGTVEDLLAFAOTI SNTAKHFYGQRPQESGILLNMVITPQNGRYQIDSDVLLI PWKI/ 
TYRNIGSDF I PRGAFGKVYLAQDIKTKKRMACKL IPVDQFKPSDVEIQACFRHENIAELYGAVLWGE 
TVHLFMEAGEGG S VLEKLES CGPMREFEI IWVTKHVXiKGLDFLHSKKVIHHDIKPSNI VFMSTKAVI/ 
VDFGLSVQMTEDVYFPKBLRGTEI YMS PEVILCRGHS TKADI YSLGATL IHMQTGTP PWVKRYPRS A 
YPSYLYI IHKQAPPLEDIADDCS PCTREL IEASLERNPNHRPRAADLIiKHEALNP PREDQ PRCTSLD 
SALLERKRLLSRKELELPEWIADSSCTGSTEESEMLKRQRSLYIBLGALAGYFN^ 
SEQ ID NO: 229 | 1430 bp 


NOV52b, 
CG93659-03 
DNA Sequence 


CTGACACTGCACTGAGCACTTTATGAGCTTGAACTCTG'TTAATCCTCACGACCACCTCATC5AGArTr 


TCCAGAAAGAGCAACAGTAATGGAGTAC ATGAftn ACTGn A AnTG AC A AT A AAGAAGAGATTGATTT'*!. 
TTAATTAAACATTTAAATGTGTCTGATGTAATAGACATTATGGAAAATCTTTATGCAAGTGAAGAGC 
CAGCAGTTTATGAACCCAGTCTAATGACCATGTGTCAAGACAGTAATCAAAACGATGAGCGTTCTAA 
GTCTCTGCTGCTTAGTGGCCAAGA.GGTACCATGGTTGTC2VTCAGTCAGATA 

TTGCTTGC TT TTGC AAACCATATATCCAACACTGCAAAGCATTTTTATGGACAACGACC AC AGG AAT 
CTGGAATTTTATTAAACATGGTCATCACTCCCCAAAATGGACGTTACCAAATAGATTCCGATGTTCT 
CCTGATCCCCTGGAAGCTGACTTACAGGAATATTGGTTCTGATTTTATTTCTCX3GGGCGCCTTTGGA 
AAGGTATACTTGGCACAAGATATAAAGACGAAGAAAAGAATGGCGTGTAAACTGATCCCAGTAGATC 
AATTTAAGCCATCTGATGTGGAAATCCAGGCTTGCTTCCGGCACGAGAACATCGCAGAGCTGTATGG 
CGC AGTCC TGTGGGGTG AAACTGTCCATCTCTTTATGGAAGCAGGCGAGGGAGGGTC TGTTCTGGAG 
AAACTGGAGAGCTGTGGACCAATGAGAGAATTTGAAATTATTTGGGTGACAAAGCATGTTCTCAAGG 
GACTTGATTTTCTACACTCAAAGAAAGTGATCCATCATGATATAAAC^ 

CATCCTGTGCAGGGGCCATTCAACCAAAGCAGACATCTACAGCCTGGGGGCCACGCTCATCCACATG 
CAGACGGGCACCCCAC CCTGGGTGAAGCGCTACCCTCGCTCAGCCTATCCCTCCTACC TGTACATAA 
TC C ACAAGCAAGC ACC TC C AC TGG AAGAC AT TG C AGATGAC TG C AGTC C AGGGATG AGAGAGC TG AT 
AGAAGCTOCCCTGGAGAGAAACCCCAATCACCGCCCAAGAGCCGCAGACCTACTAAAACATGAGGCC 
C TGAACCCGCCCAGAGAGGATC AGCCACGCTGTCAG AG TCTGGACTCTGCC CTCTTGGAGCGC AAGA 
GGCTGCTGAGTAGGAAGGAGCTGGAACTTCCTGAGAACATTGCTGATTCTTCGTGCACAGGAAGCAC 
CGAGGAATCTGAGATGCTCAAGAGGCAACGCTCTCTCTACATCGACCTCGGCGCTCTGGCTGGCTAC 
TTCAATCTTGTTCG^GGACC^CCAACGCTTGAATATGGCTGAAGGATGCCATGTTTGCTCTAAATTA 
AGACAGCATTGATCTCCTGGAGG 




ORF Start: ATG at 87 | ORF Stop: TGA at 1380 


SEQ ID NO: 230 | 431 aa | MW at 48882.2kD 


NOV52b, 
CG93659-03 
Protein Sequence 


MEYMSTGSDNKEEIDI^IKHLIWSD^ 

QEVPWL S S VRYGTVEDIjLAFANHI SOTAKHFYGQRPQESG ILLNMVITPQNGR YQIDSDVLL I PWKL 
TYRNIGSDFI SRGAFGKVYLAQDIKTKKRMACKLI PVDQFKPSDVEIQACFRHENI AELYGAVLWGE 
TVHLFMEAGEGGSVI,EKXESCGPMREFEIIWVTKHVLKGL^ 

STKADI YSLGATLIHMQTGTPPWVKRYPRSAYPSYLYI IHKQAPPLEDIADDCSPGMREL I EASLER 

NPNKRPRAAI)LLKHEALNPPREDQPRCQSLDSALLERK^ 

KRQRSLYIDLGAL AGYFNLVRG PPTLEYG 
SEQ ID NO: 231 | 1538 bp 


NOV52c, 
CG93659-02 
DNA Sequence 


CTGAC AC TGC ACTGAGC AC TT T ATG AGCTTG AAC TC TGTT AATCC TC ACG AC C AC C TC ATG AG ACTC 


TCCAGAAAGAGCAACAGTAATGGAGTAPATVJAGrArT^r:aar='nnara am a a A fc^AGATTGATTTA 

TTAATT AAAC ATT TAAATGTGTC TGATGTAATAG AC ATTATGGAAAATC T TT ATG CAAGTGAAG AGC 

CAGCAGTTTATGAACCCAGTCTAATGACCATGTGTCAAGACAGTAATCAA^CGATCAGCGTTCTAA 

GTCTCTGCTGCTTAGTGGCCAAGAGGTACCATGGTTGTCATCAGT^ 

m;CTTGCTTTTGCAAACCATATATCCAA(^CTGCAAAGC^^ 

CTGGAATTTTATTAAACATGGTCATCACTCCCCAAAATQ 

CCTGATCCCCTGGAAGCTGACTTACAGGAATATTGGTTCTGATTTTATTTCTCGGGGCGCCTTTGGA 
AAGGTATACTTGGCACAAGATATAAAGACGAAGAAAAGAATGGCGTGTAAACTGATCCCAGTAGATC 
AATTTAAGCC ATCTGATGTGGAAATCCAG^TTGCTTCCGGC ACGAGAAC ATCGCAGAGC TGTATGG 
CGCAGTCCTGTGGGGTGAAACTGTCCATCrCTTTATGGAAGCAGGCGAGGGAGGGTCTGTTCTGGAG 

aaactggagagctgtggaccaatgagagaatttgaaattatttgggtgacaaagcatgttctcaagg 
gacttgattttctacactcaj^gaaj^tgatc^ 

gtccacaaaagc tgttttggtggattttggcctaagtgttcaaatgaccgaagatgtc tattttcct 
aaggacc tccgaggaac agagatttacatgagcccagaggtcatc ctgtgcagtggccattcaacca 

AAGCAGACATCTACAGCCTGGGGGCCACGCTCATCCACATGCAGACGGGCACCCCACCCTGGGTC^ 
GCGC TACCCTCGCTCAGCCTATCCCTCCTACC TGTACATAATCCACAAGCAAGCACCTCCACTGGAA 
GACATTGC AG ATG ACTGC AG TCC AGGGATGAGAGAGCTGATAGAAGCTTCC CTGGAGAGAAACCCCA 
ATCACCG CCCAAGAGC CGCAGAC CTACTAAAAC ATGAGGCCCTGAACCCGCCCAGAGAGG ATCAGCC 
ACGCTGTCAGAGTCTGGACTCTGCCCTCTTGGAGCGCAAGAGGCTGCTGAGTAGGAAGGAGCTGGAA 
CTTCCTGAGAACATTGCTGATTCTTCGTGCACAGGAAGCACCGAGGAATCTGAGATGCTCAAGAGGC 
AACGCTCTCTCTACATCGACCTCGGCGCTCTGGCTGGCTACTTCAATCTTGTTCGGGGACCACCAAC 
GCTTGAAIATGGCTGAAGGATGCCATGTO'TGCTCT 



ORF Start: ATG at 87 



ORF Stop: TGA at 1488 





SEQ ID NO: 232 | 467 aa | MW at 52844 JkD 


NOV52c, 
CG93659-02 
Protein Sequence 


MEYMSTGSDMKEEIDLLIKHIiNVSDVIDIM^ 

QEVPWLSSTOYGTVEDLLAFANHISNTAKHFYGQRPQESG ILLNMVITPQNGRYQIDSDVLLI PWKL 

TYRNIGSDFI SRGAFGKVYLAQD IKTKKRMACKL I P VDQFKP SDVEIQACFRHENI AE1»YGAVIjWGE 

TVHLFMEAGEGGSVIiEKLESCGPMREFEIIWVTKHVL^^ 

VDFGLSVQMTEDVYFPKDLRGTEIYMSPEV^ 

YPSYLYIIHKQAPPLEDO^DCSPGMRE^ 

SALLERXRL&SRKELELPENIADSSCTGSTEESEML^ 
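The ORF coordinates and protein lengths reported in these sequence tables can be cross-checked arithmetically. The following is a minimal sketch, assuming that each reported start and stop value is the 1-based position of the first base of the respective codon, so that (stop - start) / 3 gives the number of encoded residues. The function name `orf_residue_count` is hypothetical; the values are copied from Tables 52A through 55A.

```python
def orf_residue_count(orf_start: int, orf_stop: int) -> int:
    """Return the number of codons between the start codon and the stop codon.

    Assumes both positions are 1-based and each points at the first base
    of its codon (an assumption about the table convention, not a stated rule).
    """
    span = orf_stop - orf_start
    if span <= 0 or span % 3 != 0:
        raise ValueError("start and stop are not in the same reading frame")
    return span // 3


# (clone, ORF start, ORF stop, reported protein length in aa)
REPORTED = [
    ("NOV52c", 87, 1488, 467),
    ("NOV53a", 22, 1075, 351),
    ("NOV53b", 17, 929, 304),
    ("NOV53c", 17, 1070, 351),
    ("NOV54a", 109, 1417, 436),
    ("NOV54b", 109, 1477, 456),
    ("NOV54c", 109, 733, 208),
    ("NOV55a", 620, 2243, 541),
    ("NOV55b", 134, 1838, 568),
]

for clone, start, stop, reported_aa in REPORTED:
    computed = orf_residue_count(start, stop)
    status = "consistent" if computed == reported_aa else "MISMATCH"
    print(f"{clone}: computed {computed} aa, reported {reported_aa} aa ({status})")
```

Under this assumption every clone listed gives a computed length equal to the reported length, which is a useful sanity check when the printed sequences themselves are incomplete.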



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 52B. 

Table 52B. Comparison of NOV52a against NOV52b and NOV52c.

Protein Sequence | NOV52a Residues/Match Residues | Identities/Similarities for the Matched Region
NOV52b | 1..467; 1..431 | 413/467 (88%); 413/467 (88%)
NOV52c | 1..467; 1..467 | 449/467 (96%); 449/467 (96%)
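The identity and similarity cells in these comparison tables all follow the pattern "matched/aligned (percent%)". As an illustration only, the sketch below parses such a cell and recomputes the percentage from the two counts; the helper names `parse_match_score` and `percent` are hypothetical, and the two cells are copied from Table 52B.

```python
import re

# Matches cells such as "413/467 (88%)" or "449/467(96%)".
_SCORE = re.compile(r"(\d+)\s*/\s*(\d+)\s*\((\d+)%\)")


def parse_match_score(cell: str) -> tuple[int, int, int]:
    """Return (matched residues, aligned length, reported percent)."""
    found = _SCORE.search(cell)
    if found is None:
        raise ValueError(f"not a match-score cell: {cell!r}")
    return int(found[1]), int(found[2]), int(found[3])


def percent(matched: int, aligned: int) -> float:
    """Recompute the percentage from the raw counts."""
    return 100.0 * matched / aligned


for label, cell in [("NOV52b", "413/467 (88%)"), ("NOV52c", "449/467(96%)")]:
    matched, aligned, reported = parse_match_score(cell)
    print(f"{label}: {matched}/{aligned} = {percent(matched, aligned):.1f}% "
          f"(reported {reported}%)")
```

Recomputing from the counts is a quick way to spot a digit that was misread when a table was transcribed.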



Further analysis of the NOV52a protein yielded the following properties shown in 
Table 52C. 

Table 52C. Protein Sequence Properties NOV52a

PSort analysis: | 0.6500 probability located in cytoplasm; 0.1000 probability located in mitochondrial matrix space; 0.1000 probability located in lysosome (lumen); 0.0000 probability located in endoplasmic reticulum (membrane)
SignalP analysis: | No Known Signal Sequence Predicted



A search of the NOV52a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 52D. 

Table 52D. Geneseq Results for NOV52a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV52a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
AAE05951 | Human cot oncoprotein encoded by D14497 oncogene - Homo sapiens, 467 aa. [US6265216-B1, 24-JUL-2001] | 1..467; 1..467 | 467/467 (100%); 467/467 (100%) | 0.0
[illegible] | Human [illegible] - Homo sapiens, 467 aa. [WO200011191-A2, 02-MAR-2000] | 1..467; 1..467 | 467/467 (100%); 467/467 (100%) | 0.0
AAE10313 | Human Tpl2 protein - Homo sapiens, 467 aa. [WO200166559-A1, 13-SEP-2001] | 1..467; [illegible] | [illegible]; 466/467 (99%) | 0.0
AAE10314 | Rat Tpl2 protein - Rattus sp, 467 aa. [WO200166559-A1, 13-SEP-2001] | 1..467; 1..467 | 439/467 (94%); 454/467 (97%) | 0.0
AAY79243 | Rat TPL-2 - Rattus norvegicus, 467 aa. [WO200011191-A2, 02-MAR-2000] | 1..467; 1..467 | 438/467 (93%); 453/467 (96%) | 0.0



In a BLAST search of public sequence databases, the NOV52a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 52E. 

Table 52E. Public BLASTP Results for NOV52a

Protein Accession Number | Protein/Organism/Length | NOV52a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
P41279 | Mitogen-activated protein kinase kinase kinase 8 (EC 2.7.1.-) (COT proto-oncogene serine/threonine-protein kinase) (C-COT) (Cancer Osaka thyroid oncogene) - Homo sapiens (Human), 467 aa. | 1..467; 1..467 | 467/467 (100%); 467/467 (100%) | 0.0
A48713 | serine/threonine-specific protein kinase cot, 58K form - human, 467 aa. | 1..467; 1..467 | 466/467 (99%); 466/467 (99%) | 0.0
Q63562 | Mitogen-activated protein kinase kinase kinase 8 (EC 2.7.1.-) (Tumor progression locus 2) (TPL-2) - Rattus norvegicus (Rat), 467 aa. | 1..467; 1..467 | 438/467 (93%); 453/467 (96%) | 0.0
Q07174 | Mitogen-activated protein kinase kinase kinase 8 (EC 2.7.1.-) (COT proto-oncogene serine/threonine-protein kinase) (C-COT) (Cancer Osaka thyroid oncogene) - Mus musculus (Mouse), 467 aa. | 1..467; 1..467 | 435/467 (93%); 454/467 (97%) | 0.0
A41253 | kinase-related transforming protein (EC 2.7.1.-) - human, 415 aa. | 1..397; 1..397 | 379/397 (95%); 379/397 (95%) | 0.0


PFam analysis predicts that the NOV52a protein contains the domains shown in the 
Table 52F. 



Table 52F. Domain Analysis of NOV52a

Pfam Domain | NOV52a Match Region | Identities/Similarities for the Matched Region | Expect Value
pkinase | 146..388 | 74/279 (27%); 187/279 (67%) | 4.7e-54



Example 53. 

The NOV53 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 53A. 



Table 53A. NOV53 Sequence Analysis 

SEQ ID NO: 233 | 1078 bp

NOV53a, 
CG94521-01 
DNA Sequence 

CCOOAGQGGCXQAAACTCATTTCTOJWCAT^ 

AATGGAGAACGGCCTTCTCTTCAAAGAACTTCTGCAGACTCCAAATTTTCGAATTACGGTGGTTGAT 

GATGCAGACACTGTTG AACTC TGTGGTGC GC TTAAGAACATCGTAGCTGTGGGAGCTGGGTTCTGCG 

ACGGCCTCC(^TGTGGAGACAACACCAAAGCGGCCGTCATCCGCCTGGGACTCATGGAAATGATTGC 

TTTTGCCAGGATCTTCTGCAAAGGCCAAGTGTCTACAGCCACCTTCCTAGAGAGCTGCGGGGTGGCC 

GACCTGATCACCACCTGTTACGGAGGGCGGAACCGCAGGGTGGCCGAGGCCTTCGCCAGAACTGGGA 

AGACCATTGAAGAGTTGGAGAAGGAGATGCTGAATGGGCAAAAGCTCCAAGGACCGCAGACTTCTGC 

TGAAGTGTACCGCATCCTCAAACAGAAGGGACTACTGGACAAGTTTCCATTGTTTACTGCAGTGTAT 

CAGATCTGCTACGAAAGCAGACCAGTTCAAGAGATGTTGTCTTGTCTTCAGAGCCATCCAGAGCA 
CATAAA 




ORF Start: ATG at 22 | ORF Stop: TAA at 1075

SEQ ID NO: 234 | 351 aa | MW at 38418.3 kD


NOV53a, 
CG94521-01 
Protein Sequence 


MAAAPIiKVC I VGSGNWGSAVAKI IGNNVKKLQKFAS WKMWFEETVNGRKLTDI INNDHENVKYLP 
GHKLPEI^AMSNLSEAVQDADLLVFVIPHQFIHRICDEITGRVPKKALGITLIKGIDEGPEGLKLT 
SDI IREKMG IDIS VLMGANIANEVAAEKFCETTIGSKVMENGLLFKELLQTPNFRITVVDDADTVEL 
CGALKNIVAVGAGFCDGLRCGDOTKAAVIRLGLMEMIAFARIFCKGQVSTATFLESCGVADLITTCY 
GGRNRRVAEAFARTGKTIEELEKEMLNGQKLQG 





SEQ ID NO: 235 | 936 bp


NOV53b, 
CG94521-03 
DNA Sequence 


tcagctgttgcaaaaataattggtaa^ 
tgtgottctttgaagaaacagtgaatgg^^ 
tctaaaatatcttccto^ 
aagctcatttctgacatcatccgtgagaagat^ 

ttgccaatgaggtggctgcagagaagttctgtgagaccaccatcggcagcaaagtaatggagaacgg 
ccttctcttcaaa^ 

gttgaactctgtggtgcgct^gaacatcgtagctgtcx^agctgggttctgcgacggcctccgct 
gtggagacaaca.ccaaagcggccgtcatccgcctgggactcatggaaatgattgcttttgccaggat 
cttctgcaaaggccaagtgtctacagccaccttcctagagagctgcggggtggccgacctgatcacc 

agttcgagaaggagatgctgaatgggcaaaagctccaaggaccg^gacttctgctgaagtgtaccg 
catcctcaaacagaagggactactggacaagtttccattgtttactgcagtgtatcagatctgctac 
gaaagcagaccagttcmgagatgttgtcttc^ 


ORF Start: ATG at 17 | ORF Stop: TAA at 929





SEQ ID NO: 236 | 304 aa | MW at 33235.2 kD


NOV53b, 
CG94521-03 
Protein Sequence 


maaapijcvcivgsgnwgsavakiignnvkklqkfa^ 

r^r^T^ ^*^^^EGPEGLKLI SDI IREKMG IDISVLMGANIANEVAAEKFCETTIGSKVMENGIjLFKE 
VSTATFLESCGVADLITTC^ 

GLLDKFPLFTAVYQICYESRPVQEMLSCLQSHPEHT www v xixur^jx 



SEQ ID NO: 237 | 1077 bp

NOV53c, 
CG94521-02 
DNA Sequence 


TACATTCGGCCCGGCCATGGCAGCGGCGTC^ 

TC AGCTGTTGC AAAAAT AATTGGTAATAATGTC AAGAAACTTC AG AAATTTG CC TCCAC AGTCAAG A 
TGTGGGTCTTTGAAGAAACAGTGAATGGCAGAAAACTGACAGACATCATAAATAATGACCATGAAAA 
TGTAAAATATCTTCCTGGACACAAGCTGCCAGAAAATGTGGTTGCCATGTCAAATCTTAGCGAGGCT 
GTGCAGGATGCAGACCTGCTGGTGTTTGTCATTCCCCACCAGTTCATTCACAGAATCTGTGATGAGA 
TCACTGGGAGAGTGCCCAAGAAAGCGCTGGGAATCACCCTCATCAAGGGCATAGACGAGGGCCCCGA 
GGGGCTGAAGCTCAT TTCTGACATC ATCCGTGAGAAGATGGGTATTGACATCAGTGTGC TGATGGGA 
GCC AACATTGCCAATGAGGTGGC TGCAGAGAAGTTC TGTG AGACC ACCATCGGC AGCAAAGTAATGG 
AGAACGGCCTTCTCTTCAAAGAACTTCTGCAGACTCCAAATTTTCGAATTACCGTGGTTGATGATGC 
AGAC AC TGTTGAACTCTG TGGTG CGC TTAAGAAC ATCGTAGCTGTGGG AGCTGGGTTCTGCGAC GGC 
CTCCGCTGTGGAGACAACACCAAAGCGGCCGTCATCCGCCTGGGACTCATGGAAATGATTGCTTTTG 
CCAGGATCTTCTGCAAAGGCCAAGTGTCTACAGCCACCTTCCTAGAGAGCTGCGGGGTGGCCGACCT 
GATCACCACCTGTTACGGAGfGGCGGAACCGCAGGGTGGCCGAGGCCTTCGCCAGAACTGGGAAGACC 
ATTGAAGAGTTGGAGAAGGAGATGCTGAATGGGCAAAAGCTCCAAGGACCGCAGACTTCTGCTGAAG 
TGTACCGC ATCCTC AAAC AGAAGGGACTAC TGGACAAGTTTCC ATTGTTTAC TGC AGTGTATCAGAT 
CTG C TACGAAAGC AG ACC AGTTC AAGAGATGTTGTCT TGTC TTCAGAGCCATCCAGAGCATAC ATAA 
AAAGG 




ORF Start: ATG at 17 | ORF Stop: TAA at 1070





SEQ ID NO: 238 | 351 aa | MW at 38418.3 kD


NOV53c, 
CG94521-02 
Protein Sequence 


MAAAPLKVC I VGSGNWGS AVAK I IGNNVKKLQKFASTVKMWVFEETVNGRKLTDI INNDHENVK YI/P 
GHKI* PENWAMSNL SEAVQBADLL VFVI PHQFI HRICDEI TGRVPKKALG I TL IKGIDEGPEGLKLI 
SDI IREKMGIDISVLMGAWIANEVAAEKFCETTIGSKVMENGLLFKELLQT PNFRITWDDADTVEL 
CGALKNI VAVGAGFCDGLRCGDNTKAAVIFUjGrjMEMIAFAR I FCKGQVSTATFLE SCGVADL I TTC Y 
GGRNRRVAEAFARTGKTIEELEKEMLNGQ^ 
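The molecular weights in these tables are printed with the unit "kD". Dividing each reported value by the residue count is a simple plausibility check on the magnitude. The sketch below does only that arithmetic; it assumes the common rule of thumb that an average amino acid residue contributes on the order of 110 daltons, and the helper name `mass_per_residue` is hypothetical. The values are copied from Tables 52A through 55A.

```python
def mass_per_residue(molecular_weight: float, residues: int) -> float:
    """Return the reported molecular weight divided by the residue count."""
    return molecular_weight / residues


# (clone, protein length in aa, reported MW as printed in the tables)
REPORTED_MW = [
    ("NOV52c", 467, 52844.0),
    ("NOV53a", 351, 38418.3),
    ("NOV53b", 304, 33235.2),
    ("NOV53c", 351, 38418.3),
    ("NOV54a", 436, 49243.6),
    ("NOV54b", 456, 51622.6),
    ("NOV54c", 208, 23483.8),
    ("NOV55a", 541, 56620.6),
    ("NOV55b", 568, 59557.8),
]

for clone, residues, mw in REPORTED_MW:
    print(f"{clone}: {mass_per_residue(mw, residues):.1f} per residue")
```

Each ratio falls between roughly 104 and 114, which is what one would expect if the printed figures are masses in daltons rather than kilodaltons; this reading is an inference from the arithmetic, not a statement made in the tables.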



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 53B. 



Table 53B. Comparison of NOV53a against NOV53b and NOV53c.

Protein Sequence | NOV53a Residues/Match Residues | Identities/Similarities for the Matched Region
NOV53b | 1..351; 1..304 | 304/351 (86%); 304/351 (86%)
NOV53c | 1..351; 1..351 | 351/351 (100%); 351/351 (100%)



Further analysis of the NOV53a protein yielded the following properties shown in 
Table 53C. 

Table 53C. Protein Sequence Properties NOV53a

PSort analysis: | [illegible] probability located in cytoplasm; [illegible] probability located in mitochondrial matrix space; 0.1000 probability located in lysosome (lumen); 0.0000 probability located in endoplasmic reticulum (membrane)
SignalP analysis: | Cleavage site between residues 22 and 23



A search of the NOV53a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 53D. 

Table 53D. Geneseq Results for NOV53a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV53a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
[illegible] | Drosophila melanogaster polypeptide SEQ ID NO 19344 - Drosophila melanogaster, 360 aa. [WO200171042-A2, 27-SEP-2001] | 3..350; 2..349 | 212/349 (60%); 263/349 (74%) | e-120
AAG08446 | Arabidopsis thaliana protein fragment SEQ ID NO: 5988 - Arabidopsis thaliana, 366 aa. [EP1033405-A2, 06-SEP-2000] | 7..331; 22..349 | 180/329 (54%); 233/329 (70%) | 8e-95
AAG08445 | Arabidopsis thaliana protein fragment SEQ ID NO: 5987 - Arabidopsis thaliana, 400 aa. [EP1033405-A2, 06-SEP-2000] | 7..331; 56..383 | 180/329 (54%); 233/329 (70%) | 8e-95
AAG08444 | Arabidopsis thaliana protein fragment SEQ ID NO: 5986 - Arabidopsis thaliana, 421 aa. [EP1033405-A2, 06-SEP-2000] | 7..331; 77..404 | 180/329 (54%); 233/329 (70%) | 8e-95
AAG39422 | Arabidopsis thaliana protein fragment SEQ ID NO: 48774 - Arabidopsis thaliana, 366 aa. [EP1033405-A2, 06-SEP-2000] | 7..331; 22..349 | 180/329 (54%); 232/329 (69%) | 1e-94

In a BLAST search of public sequence databases, the NOV53a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 53E. 



Table 53E. Public BLASTP Results for NOV53a

Protein Accession Number | Protein/Organism/Length | NOV53a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
AAH28726 | KIAA0089 protein - Homo sapiens (Human), 351 aa. | 1..351; 1..351 | 351/351 (100%); 351/351 (100%) | 0.0
Q14702 | KIAA0089 protein - Homo sapiens (Human), 411 aa (fragment). | 1..351; 61..411 | 351/351 (100%); 351/351 (100%) | 0.0
O57656 | Glycerol-3-phosphate dehydrogenase [NAD+], cytoplasmic (EC 1.1.1.8) (GPD-C) (GPDH-C) - Fugu rubripes (Japanese pufferfish) (Takifugu rubripes), 351 aa. | 3..350; 2..350 | 265/349 (75%); 306/349 (86%) | e-155
Q98SJ9 | Glycerol-3-phosphate dehydrogenase (EC 1.1.1.8) - Salmo salar (Atlantic salmon), 350 aa. | 7..350; 5..349 | 258/345 (74%); 301/345 (86%) | e-152
AAH32234 | Glycerol-3-phosphate dehydrogenase 1 (soluble) - Homo sapiens (Human), 349 aa. | 4..350; 2..348 | 249/347 (71%); 297/347 (84%) | e-149



PFam analysis predicts that the NOV53a protein contains the domains shown in the 
Table 53F. 

Table 53F. Domain Analysis of NOV53a

Pfam Domain | NOV53a Match Region | Identities/Similarities for the Matched Region | Expect Value
NAD_Gly3P_dh | 5..344 | 167/365 (46%); 307/365 (84%) | 2.1e-184

Example 54. 

The NOV54 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 54A. 



Table 54A. NOV54 Sequence Analysis 




SEQ ID NO: 239 | [illegible] bp

NOV54a, 
CG96613-01 
DNA Sequence 


TTATTCCCCACTTTACCTGGCTAATTGAAGTGTAACAAAAGCTTCATCCAGGAACATTGGCGCGGGA 


AACCTGGCGTACTGGCTGTGGCTTCTCTAGCGGGACTCGGCATGAGGCTGGCGCGGCTGCTTCGCGG 


AGCCGCCTTGGCCGGCCCGGGCCCGGGGCTGCGCGCCGCCGGCTTCAGCCGCAGCTTCAGCTCGGAC 
TCGGGCTCCAGCCCGGCGTCCGAGCGCGGCGTTCCGGGCCAGGTGGACTTCTACGCGCGCTTCTCGC 
CGTCCCCGCTCTCCATGAAGCAGTTCCTGGACTTCGGATCAGTGAATGCTTGTGAAAAGACCTCATT 
TATGTTTC TG CGGC AAGAGTTGCC TG TCAGACTGGC AAATATAATGAAAGAAAT AAGTC TCC TTC CA 
GATAATCTTCTCAGGACACG&TCCGTTCAATTGGTAC 

TTCTTGATTTTAAGGACAAAAGTGCTGAGGATGCTAAAGCTATTTATGACTTTACAGATACTGTGAT 

ACGGATCAGAAACCGACACAATGATGTCATTCCCACAATGGCCCAGGGTGTGATTGAATACAAGGAG 

AGCTTTGGGGTGGATCCTX5TCACCAGCCAGAATGTTCAGTACTTTTTGGATCGATTCTACATGAGTC 

GCATTTCAATTAGAATGTTACTC^TCAGCACTCTTTATTGTTTGGTGGAAAAGGCAAAGGAA 

ATCTCATCGAAAACACATTGGAAGCATAAATCCAAACTGCAATGTACTTGAAGTTATTAAAGATGGC 

TATGAAAATGCTAGGCGTCTGTGTGATT TGTATTATAT TAACTC TCC CGAACTAGAACTTGAAGAAC 

TAAATGCAAAATCACCAGGACAGCCAATACAAGTGGTTTATCTACCATCCCATCTCTATCACATG 

GTTTGAACTTTTCAAGAATGCAATCAGAGCCACTATGGAA(^CCATGCC^C^GAGGTCTTTACCCC 

CCTATTCAAGTTCATGTCACGCTGGGTAATGAGGATTTGACTGTGAAGATGAGTGACCGAGGAGGTG 

GCGTTCCTTTGAGGAAAATTGACAGACTTTTCAACTACATCTATTCAAC0X3CACCAAGACCTCGTGT 

TGAGACCTCCCGCGC^TGCCTCTGGCTGGTrTTGGTTAlH3GATTGC^ 

CAATACTTCCAAGGAGACCTGAAGCTGTATTCCCTAGAGGGTTACGGGACAGATGCAGTTATCTACA 
TTAAGGCTCTGTCAAC^GACTCAATAGAAAGACTCCCAGTGTATAACAAAGCTGCCTGGAAGCATTA 
CAACACC^CCACGAGGCTGATGACTGGTGCGTCCCCAGCAGAGAACCCAAAGACATGACGACGTTC 
CGCAGTGCCTA6ACACACTGGGGACATCGGAAAATCCAAATGTGGCTTTTGTATTAAATTTGGAAGG 
TATGGTGTTCAGAACTATATTATACCAAGTACTTTATTTATCGTTTTCACAAAACTATTTGAGTAGA 








ORF Start: ATG at 109 | ORF Stop: TAG at 1417





SEQ ID NO: 240 | 436 aa | MW at 49243.6 kD


NOV54a, 
CG96613-01 
Protein Sequence 


MRLARLLRGAAItAGPGPGLRAAGFSRSFSSDSGS SPASERGVPGQVDFYARFS PSPLSMKQFLDFGS 
VNACEKTSFMFLRQELPVRLANIMKEISLL^ 

I YDFTDTVI R I RNRHNDVT PTMAQGVI EYKESFGVD PVTSQNVQYFLDRF YMSRIS IRMLLNQHSLI* 
FGGKGKGS PSHRKH IGS INPNCNVL BVTKDGYENARRLCDLYYINS PELKLEELNAKSPGQPIQWY 
VPSHLYHM^ELFKNAMFJmiEHHANRGWPPI 

YSTAPRPRVETSRAVPLAGFGYGL PI SRL YAQYFQGDLKLYSLEGYGTDAVI YIKAL STDS I ERLPV 
YNKAAWKHYNTNHEADDWCVPSREPKDMTTFRSA 





SEQ ID NO: 241 | 1612 bp

NOV54b, 
CG96613-03 
DNA Sequence 


TTAT TCCCC ACTTTAC C TGGCTAATTGAAGTGTAAC AAAAGCTTC ATCCAGGAACATTGGCGC GGGA 


AACCTGGCGTACTGGCTGTGGCTTCTCTAGCGGGACTCGGCATGAGGCTGGCGCGGCTGCTTCGCGG 


agccgccttggccggcccgggcccggggctgcgcgccgccggcttcagccgcagcttcagctcggac 
tcgggctccagcccggcgtccgagcgcggcgttccgggccaggtggacttctacgcgcgcttctcgc 
cgtccccgctctccatgaagcagttcctggacttcggatcagtgaatgcttgtgaaaagacctcatt 
tatgtttctgcggcaagagttgcctgtcagactggcaaatataatgaaagaaataagtctccttcca 
gataatcttctcaggacaccatccgttcaattggtacaaagctggtatatccagagtcttcaggagc 
ttcttgattttaaggac?uv^gt:gctgaggatgctaaagctatttatgaaaggcctagaagaacatg 

GTTGCAGGTCTCTAGTTTATGCTGTATGGCCTGCAAGATGATCTTTACAGATACTGTGATACGGATC 
AGAAACCGACACAATGATGTCATTCCCACAATGGCCCAGGGTGTGATTGAATACAAGGAGAGCTTTG 
GGGTGGATCCTGTCACCAGCCAGAATGTTCAGTACTTTTTGGATCGATTCTACATGAGTCGCATTTC 
AATTAGAATGTTACTCAA.TCAGCACTCT^ 

CGAAAACACATTGGAAGCATAAATCC^^CTGCAATGTACTTGAAGTTATTAAAGATGGCTATGAi^ 
ATGCTAGGCGTCTGTGTGATTTGTATTATATTAACTCTCCCGAACTAGAACTTGAAGAACTAAATGC 
AAAATCACCAGGACAGCCAATACAAGTGGTTTATGTACCATCCCATCTCTATCACATGGTGTTTGAA 
CTTTTCAAGAATGCAATGAGAGCCACTATGGAACACCATGCCAACAGAGGTGTTTACCCCCCT^TTC 
AAGTTCATGTCACGCTGGGTAATGAGGATTTGACTGTGAAGATGAGTGACCGAGGAGGTGGCGTTCC 
TTTGAGGAAAATTGACAGACTTTTCAACTACATGTATTCAACTGCACCAAGACCTCGTGTTGAGACC 
TCCCGCGCAGTGCCTCTGGCTGGTTTTGGTTATGGATTGCCCATATCACGTCTTTACGCACAATACT 
TCCAAGGAGACCTGAAGCTGTATTCCCTAGAGGGTTACGGGACAGATGCAGTTATCTACATTAAGGC 
TC TGTCAACAGACTCAATAGAAAGACTCCC AGTGTATAAC AAAGC TGCC TGGAAGCATTACAACACC 
AACCACGAGGCTGATGACTGGTGCGTCCCCAGCAGAGAACCCAAAGACATGACGACGTTCCGCAGTG 
CCTAGACACACTGGGGACATCGGAAAATCCAAATGTCGCTTTTGTATTAAATTTGGAAGRTATflRTV-' 


TTCAGAACTATATTATACCAAGTACTTTATTTATCGTTTTCACAAAACTATTTRARTaca mi 
GAAA " — 




ORF Start: ATG at 109 | ORF Stop: TAG at 1477





SEQ ID NO: 242 | 456 aa | MW at 51622.6 kD


NOV54b, 
CG96613-03 
Protein Sequence 


KRL ARLLRGAALAGPGPGLRAAGFSRS FSSDSGSS PASERGVPGQVDF YARFS PS PLSMKQFLDFGS 
VNAC EKT SFMF LRQEL PVRX ANIMKE I SIOO PDNLLRT PSVQ L VQSWY I Q S LQELIjDFKDK S AEDAKA 
IYERPRRTWLQVSSLCCMACKMI FTDTVIR IRNRHNDVI PTMAQGVI EYKESFG VDPVTSQNVQYFL 
DRFYl^RISIRMLIiNQKSLLFGGKGKGSPSHRKHIGSINPNCNVLEVIKDGYENARRLCDLYY 
ELELEELNAKS PGQPIQVVYVPSHLYHIWFELFKNAMRATMEHHANRGVY PP IQVHVTLGNEDLTVK 

MSDRGGGVPI^RKIDRLFNYMYSTAPRPRVETSRAVPIAGFGYGLPISRLYAQYFQGDLKLYSLEGYG 
TD AVIYIKAL S TDSIERL PVYNKAAl\JKHYNTNHEADDWCVP SR E PKDMT T PR S A 





SEQ ID NO: 243 | 967 bp


NOV54c, 
CG96613-02 
DNA Sequence 


TTATTCCCCACTTTACCTX5GCTAATTGAAGTGTAACAAAAGCTTCATC 
AACCTGGCGTACTGGCTGTGGCTTCTCTAGCG^^ 

AGCCGCCTTGGCCGGCCCGGGCCCGGGGCTGCGCGCCGCCGGCTTCAGCCGCAGCTTCAGCTCGGAC 
TCGGGCTCCAGTCCCGGCGTCCGAGCGCGGCGTTCCGGGCCAGGTGGACTTCTACGCGCGCTTCTCGC 
CGTCCCCGCTCTCCATGAAGCAGTTCCTGGACTTCGGATCAGTGAATGCTTGTGAAAAGACCTCATT 
TATGTTTC TGCGGCAAGAGT TGCC TGTCAGACTGGCAAATATAATGAAAGAAATAAGTC TCCTTCCA 
GATAATCTTCTCAGGACACCATCCGTTCAATTGGTACAAAGCTGGTATATCCAGAGTCTTCAGGAGC 
TTCTTGATTI^AAGGACAAAAGTGCTGAGGATGCTAAAGCTATTTATGAAAGGCCTAGAAGAACATG 
GTTGCAGGTCTCTAGTTTATGCTGTATGGCCTGCAAGATGATCTTTACAGATACTGTGATACGGATC 
AGAAACCGACACAATGATGTCATTCCCACAATGGCCCAGGGTGTGATTGAATACAAGGAGAGCTTTG 
GGGTGGATCCTGTC^CCAGCCAGAATGTTCAGTACTTTATTTATCGTTTTCACAAAACTATTTCAGT 
AGAATAAATGGAAACTGAATTCCATTTGTGCC^^ 

TTACACC TATAT TTTCACAG TTAATTG AAC AT ATTTT T AAACAAC TGTAGT T T TGGGC AAC TTTTr A 




AAAAAAATCCTTTCTTTTTTGTGGGC TAG 

ORF Start: ATG at 109 | ORF Stop: TGA at 733





SEQ ID NO: 244 | 208 aa | MW at 23483.8 kD


NOV54c, 
CG96613-02 
Protein Sequence 


MRLAraLRGAALAGPGPGLRAAGFSRSFSSDSGSSPASERGVPGQVDFYARFSPSPLSMKQFLDFGS 

VNACEKTSFMFLRQELPVRLANIMKEISLLPDNLLRTPSVQLVQSOTIQSLQELLDFKDKSAEDAKA 

IYERPP^TWLQVSSLCCMACKmiFTDTV'IRIRNRHNDVIPTMAQGVIEYKESFGVDPVTSQNVQYFI 
YRFHKTI 

Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 54B. 

Table 54B. Comparison of NOV54a against NOV54b and NOV54c

Protein Sequence | NOV54a Residues/Match Residues | Identities/Similarities for the Matched Region
NOV54b | 42..436; 42..456 | 394/415 (94%); 395/415 (94%)
NOV54c | 42..185; 42..205 | 140/164 (85%); 143/164 (86%)



Further analysis of the NOV54a protein yielded the following properties shown in 
Table 54C. 

Table 54C. Protein Sequence Properties NOV54a

PSort analysis: | 0.4251 probability located in mitochondrial matrix space; 0.3802 probability located in microbody (peroxisome); 0.1914 probability located in lysosome (lumen); 0.1017 probability located in mitochondrial inner membrane
SignalP analysis: | Cleavage site between residues 22 and 23
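A PSort result line of this kind can be turned into (location, score) pairs and ranked. The sketch below is illustrative only: the function name `parse_psort` is hypothetical, and the input string is the PSort text from Table 54C. Note that the four scores in this table sum to more than 1, so they are treated here as independent scores to be ranked, not as a normalized probability distribution.

```python
import re

PSORT_TEXT = (
    "0.4251 probability located in mitochondrial matrix space; "
    "0.3802 probability located in microbody (peroxisome); "
    "0.1914 probability located in lysosome (lumen); "
    "0.1017 probability located in mitochondrial inner membrane"
)

_ENTRY = re.compile(r"\s*([01]\.\d+) probability located in (.+?)\s*$")


def parse_psort(text: str) -> list[tuple[str, float]]:
    """Split a semicolon-separated PSort line into (location, score) pairs."""
    entries = []
    for part in text.split(";"):
        found = _ENTRY.match(part)
        if found:
            entries.append((found.group(2), float(found.group(1))))
    return entries


predictions = parse_psort(PSORT_TEXT)
best_location, best_score = max(predictions, key=lambda item: item[1])
print(f"highest-scoring location: {best_location} ({best_score})")
```

For NOV54a the highest-scoring location is the mitochondrial matrix space, in line with the mitochondrial pyruvate dehydrogenase kinase homologs found for this protein.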



A search of the NOV54a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 54D. 

Table 54D. Geneseq Results for NOV54a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV54a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
ABG16621 | Novel human diagnostic protein #16612 - Homo sapiens, 415 aa. [WO200175067-A2, 11-OCT-2001] | 42..435; 21..413 | 269/395 (68%); 331/395 (83%) | e-162
ABB58044 | Drosophila melanogaster polypeptide SEQ ID NO 924 - Drosophila melanogaster, 413 aa. [WO200171042-A2, 27-SEP-2001] | 26..420; 2..396 | 219/401 (54%); 288/401 (71%) | e-121
AAE07838 | Maize pyruvate dehydrogenase kinase (PDK)-2 - Zea mays, 364 aa. [US6265636-B1, 24-JUL-2001] | 40..401; 8..364 | 144/374 (38%); 211/374 (55%) | 2e-60
AAW64724 | A. thaliana PDHK protein from clone YA5 - Arabidopsis thaliana, 366 aa. [WO9835044-A1, 13-AUG-1998] | 57..401; 29..366 | 142/357 (39%); 209/357 (57%) | 3e-58
AAE07837 | Maize pyruvate dehydrogenase kinase (PDK)-1 - Zea mays, 347 aa. [US6265636-B1, 24-JUL-2001] | 40..401; 8..347 | 135/371 (36%); 205/371 (54%) | 4e-56
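The Expect values in these tables mix ordinary notation ("2e-60", "0.0") with the abbreviated form "e-162", which omits the leading 1. The sketch below, offered as an illustration with a hypothetical helper name `parse_expect`, normalizes both forms to floats so that hits can be sorted numerically; the identifiers and values are copied from Table 54D.

```python
def parse_expect(text: str) -> float:
    """Convert an Expect value string to a float.

    Accepts ordinary notation ("2e-60", "0.0") and the abbreviated
    form "e-162", which is read as 1e-162.
    """
    cleaned = text.strip().lower()
    if cleaned.startswith("e"):
        cleaned = "1" + cleaned
    return float(cleaned)


hits = [
    ("AAE07837", "4e-56"),
    ("ABG16621", "e-162"),
    ("AAW64724", "3e-58"),
    ("ABB58044", "e-121"),
    ("AAE07838", "2e-60"),
]

# Smaller Expect values indicate matches less likely to arise by chance.
ranked = sorted(hits, key=lambda hit: parse_expect(hit[1]))
for identifier, expect in ranked:
    print(identifier, expect)
```

Sorting this way reproduces the order in which the rows appear in the table, from the most significant hit to the least.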



In a BLAST search of public sequence databases, the NOV54a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 54E. 

Table 54E. Public BLASTP Results for NOV54a

Protein Accession Number | Protein/Organism/Length | NOV54a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
Q15118 | [Pyruvate dehydrogenase [lipoamide]] kinase isozyme 1, mitochondrial precursor (EC 2.7.1.99) (Pyruvate dehydrogenase kinase isoform 1) - Homo sapiens (Human), 436 aa. | 1..436; 1..436 | 436/436 (100%); 436/436 (100%) | 0.0
Q63065 | [Pyruvate dehydrogenase [lipoamide]] kinase isozyme 1, mitochondrial precursor (EC 2.7.1.99) (Pyruvate dehydrogenase kinase isoform 1) (PDK P48) - Rattus norvegicus (Rat), 434 aa. | 1..436; 1..434 | 402/436 (92%); 412/436 (94%) | 0.0
Q8R2U8 | Similar to pyruvate dehydrogenase kinase, isoenzyme 1 - Mus musculus (Mouse), 432 aa. | 1..436; 1..432 | 401/436 (91%); 412/436 (93%) | 0.0
Q15119 | [Pyruvate dehydrogenase [lipoamide]] kinase isozyme 2, mitochondrial precursor (EC 2.7.1.99) (Pyruvate dehydrogenase kinase isoform 2) - Homo sapiens (Human), 407 aa. | 37..434; 11..405 | 277/398 (69%); 340/398 (84%) | e-168
I70159 | [pyruvate dehydrogenase (lipoamide)] kinase (EC 2.7.1.99) 2 - human, 407 aa. | 37..434; 11..405 | 276/398 (69%); 340/398 (85%) | e-168

PFam analysis predicts that the NOV54a protein contains the domains shown in the 
Table 54F. 

Table 54F. Domain Analysis of NOV54a

Pfam Domain | NOV54a Match Region | Identities/Similarities for the Matched Region | Expect Value
HATPase_c | 268..393 | 32/134 (24%); 84/134 (63%) | 8.5e-20



Example 55. 

The NOV55 clone was analyzed, and the nucleotide and encoded polypeptide 
sequences are shown in Table 55A. 

Table 55A. NOV55 Sequence Analysis

SEQ ID NO: 245 | 2885 bp

NOV55a, 
CG96736-01 
DNA Sequence 

^^^^^^^^^^^^^ 




g^C*CTCAA^CTCCW3AGCCAflGra^^ 



> 


GC^GCCGGCGTGKSCGC^^ 

GCGCTTGAGGCCTTCGTCTTCCCGGGCGAGCTGCTGCTGCGTCTGCTGCGGATGATCATCTTGCCGC 
TGGTGGTGTGCAGCTTGATCGGCGGCGCCGCCAGCCTGGACCCCGGCGCGCTCGGCCGTCTGGGCGC 
CTGGGCGCTGCTCTTTTTCCTGGTCACCACGCTGCTGGOGTCGGCGCTCGGAGTGGGCTTGGCGCTG 
GCTCTGCAGCCGGGCGCCGCCTCCGCCGCCATCAACGCCTCCGTGGGAGCCGCGGGCAGTGCCGAAA 
ATGCCCCCAGCAAGGAGGTGCTCGATTCGTTCCTGGATCTTGCGAGAAATATCTTCCCTTCCAACCT 
GGTGTCAGCAGCCTTTCGCTCATACTCTACCACCTATGAAGAGAGGAATATCACCGGAACCAGGGTG 
AAGGTGCCCGTGGGGCAGGAGGTGGAGGGGATGAACATCCTGGGCTTGGTAGTGTTTGCCATCGTCT 
TTGGTGTGGCGCTGCGGAAGCTGGGGCCTGAAGGGGAGCTGCTTATCCGCT'TCTTCAACTCCTTCAA 
TGAGGCCACCATGGTTCTGGTCTCCTGGATCATGTGGTACGCCCCTGTGGGCATCATGTTCCTGGTG 
GCTGGCAAGATCGTGGAGATGGAGGATG TGGGTTTACTCTTTGCCCGC CTTGGCAAGTACATTC TGT 
GCTGCCTGCTGGGTCACGCCATCCATGGGCTCCTGGTACTGCCCCTCATCTACTTCCTCTTCACCCG 
CAAAAACCCCTACCGCTTCCTGTGGGGCATCGTGACGCCGCTGGCCACTGCCTTTGGGACCTCTTCC 
AGTTCCGCCACGCTGCCGCTGATGATGAAGTGCGTGGAGGAGAATAATGGCGTGGCCAAGCACATCA 
GCCGTTTCATCCTGCCCATCGGCGCCACCGTCAACATGGACGGTGCCGCGCTCTTCCAGTGCGTGGC 
CGCAGTGTTCAT0X3CACAGCT(^GCCAGCAGTCCTTGGACTTCGTAAAGATCATCyvCCATCCTGGTC 
ACGGCCACAGCGTCCAGCGTGGGGGCAGCGGGCATCCCTGCTGGAGGTGTCCTCACTCTGGCCATCA 
TCCTCGAAGCAGTCAACCTCCCGGTCGACCATATCTCCTTGATCCTGGCTGTGGACTGGCTAGTCGA 
CCGGTCCTGTACCGTCCTCAATGTAGAAGGTGACGCTCTGGGGGCAC^ACTCCTCCAAAATTATGTG 
GACCGTACGGAGTCGAGAAGCACAGAGCCTGAGTTGATACAAGTGAAGAGTGAGCTGCCCCTGGATC 
CGCTGCCAGTCCCCACTGAGGAAGGAAACCCCCTCCTCAAACACTATCGGGGGCCCGCAGGGGATGC 
CACGGTCGCCTCTGAGAAGGAATCAGTCATGTAAACCCCGGGAGGGACCTTCCCTGCCCTGCTGGGR 


GTGCTC TTTGGACACTGGAT TATGAGGAATGGATAAATGGATGAGCTAGGGC TCTGGGGGTCTGCCT 


GCACACTC TGGGGAGCCAGGGGC CCCAGCACCC TCC AGG AC AGGAGAT CTGGGATGCC TGGCTGC TG 


GAGTACATGTGTTCACAAGGGTTACTCCTCAAAACCCCCAGTTCTCACTCATGTCCCCAACTCAAGG 


CTAGAAAACAGCAAGATGGAGAAATAATGTTCTGCTGCGTCCCCACCGTGACCTGCCTGGCCTCCCC 


TGTCTCAGGGAGCAGGTCACAGGTCACCATGGGGAATTCTAGCCCCCACTGGGGGGATGTTACAACA 


CCATGCTGGTTATTTTGGCGGCTGTAGTTGTGGGGGGATGTGTGTGTGCACGTGTGTGTGTGTO 


GTGTGTGTGTGTGTGTGTGTTCTGTGACCTCCTGTCCCCATGGTACGTCCCACCCTGTCCCCAGATC 


CCCTATTCCCTCCACAATAACAGAAACACTCCCAGGGACTCTGGGGAGAGGCTGAGGACAAATACCT 


GCTGTCACrcCAGAGGACATTTTTTTTAGCAATAAAATTGAGTGTCAACTATTAAAAAAAAAAAAAA 


AAAA 




ORF Start: ATG at 620 | ORF Stop: TAA at 2243





SEQ ID NO: 246 | 541 aa | MW at 56620.6 kD


NOV55a, 
CG96736-01 
Protein Sequence 


MVADPPRDSKGLAAAEPPPTGAWQLASIEDQGAAAGGYCGSI^^ 

LGLGVSGAGGAL ALGPGAIiEAFVFPGEI»LLRLLRMI I LPLWC SL IGG AASLDPGALGRLGAWALLF 
FLVTTLLASALGVGLAIALQPGAASAAINASVGAAGSAENAPSKEVLDSFLDLARmFPSWLVSAAF 
RSYSTTYEERNITGTRVKVPVGQEVEGMNILGLV\^^ 

LVSWIMWYAPVG IMFL VAGKI VEMEDVGLLFARLGK YI LCCLLGHAIHGLLVLPL IYFLFTRKNPYR 
FLWG I VTPIiATAFGTS SSSATLPLMMKC VEENNGVAKH XSRF ILP IGATVNMDGAALFQCVAAVFI A 
QLSQQSLDFVKIITILVTATASSVGAAGIPAGGVLTLAIILEAVNLPVDHISLILAVDWLVDRSCTV 
LNVEGDALGAGIiLQNYVDRTESRSTEPKLIQVKSELPLDPLPVPTEEGNPLLKHYRGPAGDA'TV'ASE 

kesvm: 





SEQ ID NO: 247 | 2017 bp


NOV55b, 
CG96736-02 
DNA Sequence 


CGTACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAG 




GAGACCCAAGCTGGCTAGCGTTTAAACTTAAGCTTGGTACCGAGCTCGGATCCACTAGTCCAGTGTG 
GTGGAATTCCACCATGGTGGCCGATCCTCCTCGAGACTCCAAGGGGCTCGCAGCGGCGGAGCCCACC 
GCCAACGGGGGCCTGGCGCTGGCCTCCATCGAGGACCAAGGCGCGGCAGCAGGCGGCTACTGCGGTT 
CCCGGGACCAGGTGCGCCGCTGCCTTCGAGCCAACCTGCTTGTGCTGCTGACAGTGGTGGCCGTGGT 
GGCCGGCGTGGCGCTGGGACTGGGGGTGTCGGGGGCCGGGGGTGCGCTGGCGTTGGGCCCGGAGCGC 
TrGAGCGCCTTTCGTCTTCCCGGGCGAGCTGCTGCTG 




CTGCAGCCGGGCGCCGCCTCCGCCGCCATCAACGCCTCCGTGGGAGCCGCGGGCAGTGCCGAAAATG 
CCCCCAGCAAGGAGGTGCTCGATTCGTTCCTGGATCTTGCGAGAAATATCTTCCCTTCCAACCTGGT 
GTCAGCAGCCTTTCGCTCATACTCTACCACCTATGAAGAGAGGAATATCACCGGAACCAGGGTGAAG 
GTGCCCGTGGGGCAGGAGGTGGAGGGGATGAACATCCTGGGCTTGGTAGTGTTTGCCATCGTCTTTG 
GTGTGGCGCTGCGGAAGCTGGGGCCTGAAGGGGAGCTGCTTATCCGCTTCTTCAACTCCTTCAATGA 
GGCC^CCATGGTTCTGGTCTCCTGGATCATCTGGT^ 

GGCAAGATCGTGGAGATGGAGGATGTGGGTTTACTCTTTGCCCGCCTTGGCAAGTACATTCTGTGCT 
GCCTGCTGGGTCACGCCATCCATGGGCTCCTGGTACTGCCCCTCATCTACTTCCTCTTCACCCGCAA 
AAACCCCTACCGCTTCCTGTGGGGCATCGTGACGCCGCTGGCCACTGCCTTTGGGACCTCTTCCAGT 
TCCGCCACGCTGCCGCTGATGATGAAGTGCGTGGAGGAGAATAATGGCGTGGCCAAGCACATCAGCC 
GTTTCATCCTGCCCATCGGCGCCACCGTCAACATGGACGGTGCCGCGCTCTTCCAGTGCGTGGCCGC 
AGTGTTCATTGCACAGCTCAGCCAGCAGTCCTTGGACTTCGTAAAGATCATCACCATCCTGGTCACG 
GCCACAGCGTCCAGCGTGGGGGCAGCGGGCATCCCTGCTGGAGGTGTCCTCACTCTGGCCATCATCC 
TCGAAGCAGTCAACCTCCCGGTCGACCATATCTCCTTGATCCTGGCTGTGGACTGGCTAGTCGACCG 
GTCCTGTACCGTCCTCAATGTAGAAGGTGACGCTCTGGGGGCAGGACTCCTCCAAAATTACGTGGAC 
CGTACGGAGTCGAGAAGC^CAGAGCCTGAGTTGATACAAGTGAAGAGTGAGCTGCCCCTGGATCCGC 
TGCCAGTCCCCACTGAGGAAGGAAACCCCCTCCTCAAACACTATCGGGGGCCCGCAGGGGATGCCAC 
GGTCGCCTCTGAGAAGGAATCAGTCATGT2^AGCGGCCGCTCGAGTCTAGAGGGCCCGTTTAAACCCG 


CTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCC 


TTGACCC TGG AAGGTGCCAC TC CC ACTGTCCTTTCCTAATAAAATGAGGAAATTGCATCGC ATTGTC 


TGAGTAG 




ORF Start: at 134 | ORF Stop: TAA at 1838





SEQ ID NO: 248 | 568 aa | MW at 59557.8 kD


NOV55b, 
CG96736-02 
Protein Sequence 


GDPSWIAFKLKLGTELGSTS PVWNSTMVADPPRDSKGLAAAEPTANGGXiAIiAS IEDQGAAAGGYCG 
SRDQVRRCLRANLLVLL IVVAWAGVALGLGVSGAGGALALGP ERLSAFVF PGELLLRLLRMI I LPL 
WC SL IGGAASIJ!)PGALGRLGAWALLPFLVTTLLASALGVGLAL AIjQ PG AAS AA I NAS VGAAGSAEN 
APSKEVLDSFLDLARNI FPSNLVSAAFRSYSTTYEERNI TGTRVKVPVGQEVEGMNILGL WFAI VF 
GVALRKLGPEGELL IRFFNSFNEATMVLVS WIMWYAPVG IMFLVAGKIVEMEDVGLLFARLGKYI LC 
CLLGHAIHGL L VLPL IYFIiFTKKNP YRFLWGIVTPLATAFGTSSS S ATLPLMMKCVEENNG VAKHI S 
RFI LP IGATVNMJGAALFQC VAAVF IAQLSQQSLDFVKI IT I LVTATAS SVGAAG IPAGGVL TLAI I 
LEAVISrLPVDHISLII^VDmjVE^ 
L P VPTEEGNP L LKHYRG PAGDATVA S EKES VM 



Sequence comparison of the above protein sequences yields the following sequence 
relationships shown in Table 55B. 

Table 55B. Comparison of NOV55a against NOV55b.

Protein Sequence | NOV55a Residues/Match Residues | Identities/Similarities for the Matched Region
NOV55b | 1..541; 28..568 | 423/541 (78%); 423/541 (78%)
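The residue ranges in these comparison tables are written as "first..last". As a small illustration, the sketch below parses the two ranges from Table 55B and reports the length of each span and the shift between them; the helper names `parse_span` and `span_length` are hypothetical.

```python
def parse_span(text: str) -> tuple[int, int]:
    """Parse a residue range written as 'first..last' into two integers."""
    first, last = text.split("..")
    return int(first), int(last)


def span_length(span: tuple[int, int]) -> int:
    """Number of residues covered by an inclusive range."""
    return span[1] - span[0] + 1


query = parse_span("1..541")    # NOV55a residues
match = parse_span("28..568")   # NOV55b match residues

offset = match[0] - query[0]
print(f"query span: {span_length(query)} residues")
print(f"match span: {span_length(match)} residues")
print(f"offset of match relative to query: {offset}")
```

Both spans cover 541 residues and the match is shifted by 27 positions, which agrees with NOV55b (568 aa) being 27 residues longer than NOV55a (541 aa) at its amino terminus.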



Further analysis of the NOV55a protein yielded the following properties shown in 
Table 55C. 



Table 55C. Protein Sequence Properties NOV55a

PSort analysis: | 0.6000 probability located in plasma membrane; 0.4000 probability located in Golgi body; 0.3000 probability located in endoplasmic reticulum (membrane); 0.3000 probability located in microbody (peroxisome)
SignalP analysis: | Cleavage site between residues 70 and 71



A search of the NOV55a protein against the Geneseq database, a proprietary 
database that contains sequences published in patents and patent publications, yielded 
several homologous proteins shown in Table 55D. 

Table 55D. Geneseq Results for NOV55a

Geneseq Identifier | Protein/Organism/Length [Patent #, Date] | NOV55a Residues/Match Residues | Identities/Similarities for the Matched Region | Expect Value
ABG61858 | Prostate cancer-associated protein #59 - Mammalia, 541 aa. [WO200230268-A2, 18-APR-2002] | 1..541; 1..541 | 531/541 (98%); 531/541 (98%) | 0.0
AAR95044 | Apoptosis participating protein - Homo sapiens, 514 aa. [JP08089257-A, 09-APR-1996] | 1..513; 1..513 | 499/513 (97%); 499/513 (97%) | 0.0
AAY78144 | Human neutral amino acid transporter ASCT1 - Homo sapiens, 532 aa. [US6020479-A, 01-FEB-2000] | 32..541; 21..532 | 314/521 (60%); 378/521 (72%) | e-161
AAY99961 | Human amino acid transporter ASCT1 protein - Homo sapiens, 532 aa. [US6074828-A, 13-JUN-2000] | 32..541; 21..532 | 314/521 (60%); 378/521 (72%) | e-161
AAY97139 | ASCT1 human neutral amino acid transporter protein - Homo sapiens, 532 aa. [US6100085-A, 08-AUG-2000] | 32..541; 21..532 | 314/521 (60%); 378/521 (72%) | e-161



In a BLAST search of public sequence databases, the NOV55a protein was found to 
have homology to the proteins shown in the BLASTP data in Table 55E. 

Table 55E. Public BLASTP Results for NOV55a

Protein Accession Number | Protein/Organism/Length | NOV55a Residues/Match Residues | Identities/Similarities for the Matched Portion | Expect Value
AAD09814 | Neutral amino acid transporter - Homo sapiens (Human), 541 aa. | 1..541; 1..541 | 532/541 (98%); 532/541 (98%) | 0.0
[illegible] | Neutral amino acid transporter B(0) (ATB(0)) (Sodium-dependent neutral amino acid transporter type 2) (RD114/simian type D retrovirus receptor) (Baboon M7 virus receptor) - Homo sapiens (Human), 541 aa. | 1..541; 1..541 | 531/541 (98%); 531/541 (98%) | 0.0
O19105 | Neutral amino acid transporter B(0) (ATB(0)) (Sodium-dependent neutral amino acid transporter type 2) - Oryctolagus cuniculus (Rabbit), 541 aa. | 1..541; 1..541 | 459/542 (84%); 485/542 (88%) | 0.0
Q95JC7 | Neutral amino acid transporter B(0) (ATB(0)) (Sodium-dependent neutral amino acid transporter type 2) - Bos taurus (Bovine), 539 aa. | 1..541; 1..539 | 465/542 (85%); 486/542 (88%) | 0.0
AAM94351 | Na+-dependent amino acid transporter ASCT2 - Rattus norvegicus (Rat), 551 aa. | 1..541; 1..551 | 445/553 (80%); 471/553 (84%) | 0.0

PFam analysis predicts that the NOV55a protein contains the domains shown in the 
Table 55F. 



Table 55F. Domain Analysis of NOV55a

Pfam Domain | NOV55a Match Region | Identities; Similarities for the Matched Region | Expect Value
SDF | 54..485 | 195/465 (42%); 373/465 (80%) | 1.5e-178

Example B: Sequencing Methodology and Identification of NOVX Clones

1. GeneCalling™ Technology: This is a proprietary method of performing 
differential gene expression profiling between two or more samples developed at CuraGen 
and described by Shimkets, et al., "Gene expression analysis by transcript profiling coupled 
to a gene database query" Nature Biotechnology 17:798-803 (1999). cDNA was derived
from various human samples representing multiple tissue types, normal and diseased states, 
physiological states, and developmental states from different donors. Samples were 
obtained as whole tissue, primary cells or tissue cultured primary cells or cell lines. Cells 
and cell lines may have been treated with biological or chemical agents that regulate gene 
expression, for example, growth factors, chemokines or steroids. The cDNA thus derived 
was then digested with up to as many as 120 pairs of restriction enzymes and pairs of 
linker-adaptors specific for each pair of restriction enzymes were ligated to the appropriate 
end. The restriction digestion generates a mixture of unique cDNA gene fragments. 
Limited PCR amplification is performed with primers homologous to the linker adapter 
sequence where one primer is biotinylated and the other is fluorescently labeled. The 
doubly labeled material is isolated and the fluorescently labeled single strand is resolved by 
capillary gel electrophoresis. A computer algorithm compares the electropherograms from 
an experimental and control group for each of the restriction digestions. This and additional 
sequence-derived information is used to predict the identity of each differentially expressed 
gene fragment using a variety of genetic databases. The identity of the gene fragment is 
confirmed by additional, gene-specific competitive PCR or by isolation and sequencing of 
the gene fragment. 

2. SeqCalling™ Technology: cDNA was derived from various human 
samples representing multiple tissue types, normal and diseased states, physiological states, 
and developmental states from different donors. Samples were obtained as whole tissue, 
primary cells or tissue cultured primary cells or cell lines. Cells and cell lines may have 
been treated with biological or chemical agents that regulate gene expression, for example, 
growth factors, chemokines or steroids. The cDNA thus derived was then sequenced using 
CuraGen's proprietary SeqCalling technology. Sequence traces were evaluated manually 
and edited for corrections if appropriate. cDNA sequences from all samples were 
assembled together, sometimes including public human sequences, using bioinformatic 
programs to produce a consensus sequence for each assembly. Each assembly is included in 
CuraGen Corporation's database. Sequences were included as components for assembly 
when the extent of identity with another component was at least 95% over 50 bp. Each
assembly represents a gene or portion thereof and includes information on variants, such as
splice forms, single nucleotide polymorphisms (SNPs), insertions, deletions and other
sequence variations.

3. PathCalling™ Technology: The NOVX nucleic acid sequences are 
derived by laboratory screening of cDNA libraries by the two-hybrid approach. cDNA
fragments covering either the full length of the DNA sequence, or part of the sequence, or 
both, are sequenced. In silico prediction was based on sequences available in CuraGen 
Corporation's proprietary sequence databases or in the public human sequence databases, 
and provided either the full length DNA sequence, or some portion thereof. 

The laboratory screening was performed using the methods summarized below: 
cDNA libraries were derived from various human samples representing multiple 
tissue types, normal and diseased states, physiological states, and developmental states 
from different donors. Samples were obtained as whole tissue, primary cells or tissue 
cultured primary cells or cell lines. Cells and cell lines may have been treated with 
biological or chemical agents that regulate gene expression, for example, growth factors, 
chemokines or steroids. The cDNA thus derived was then directionally cloned into the 
appropriate two-hybrid vector (Gal4-activation domain (Gal4-AD) fusion). Such cDNA
libraries as well as commercially available cDNA libraries from Clontech (Palo Alto, CA)
were then transferred from E. coli into a CuraGen Corporation proprietary yeast strain
(disclosed in U.S. Patents 6,057,101 and 6,083,693, incorporated herein by reference in
their entireties).

Gal4-binding domain (Gal4-BD) fusions of a CuraGen Corporation proprietary
library of human sequences were used to screen multiple Gal4-AD fusion cDNA libraries
resulting in the selection of yeast hybrid diploids in each of which the Gal4-AD fusion 
contains an individual cDNA. Each sample was amplified using the polymerase chain 
reaction (PCR) using non-specific primers at the cDNA insert boundaries. Such PCR 
product was sequenced; sequence traces were evaluated manually and edited for 
corrections if appropriate. cDNA sequences from all samples were assembled together, 
sometimes including public human sequences, using bioinformatic programs to produce a 
consensus sequence for each assembly. Each assembly is included in CuraGen 
Corporation's database. Sequences were included as components for assembly when the 
extent of identity with another component was at least 95% over 50 bp. Each assembly 
represents a gene or portion thereof and includes information on variants, such as splice
forms, single nucleotide polymorphisms (SNPs), insertions, deletions and other sequence
variations.
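
The inclusion rule for assembly components (at least 95% identity over 50 bp) can be illustrated with a short Python sketch. It assumes the two components have already been aligned without gaps to equal length, and it reads the rule as "at least one 50 bp window reaches 95% identity". The source does not say how the window was chosen, and the function name and parameters are hypothetical.

```python
def meets_assembly_criterion(seq_a, seq_b, window=50, min_identity=0.95):
    """Return True if any `window`-bp stretch of two pre-aligned sequences
    reaches at least `min_identity` fractional identity.

    Assumption: seq_a and seq_b are already aligned to the same length;
    a '-' character marks a gap and never counts as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    n = len(seq_a)
    if n < window:
        return False  # too short to satisfy an over-50-bp rule

    # 1 where the aligned bases agree, 0 otherwise
    matches = [1 if a == b and a != "-" else 0 for a, b in zip(seq_a, seq_b)]
    required = min_identity * window

    # Sliding-window sum of matches
    running = sum(matches[:window])
    if running >= required:
        return True
    for i in range(window, n):
        running += matches[i] - matches[i - window]
        if running >= required:
            return True
    return False


# Hypothetical 50 bp fragments, for illustration only
reference = "ACGTACGTAC" * 5
two_mismatches = "TT" + reference[2:]     # 48/50 identical
three_mismatches = "TTT" + reference[3:]  # 47/50 identical
print(meets_assembly_criterion(reference, two_mismatches))
print(meets_assembly_criterion(reference, three_mismatches))
```

With a 50 bp window, 95% identity corresponds to at least 48 matching positions, so the first comparison passes and the second does not.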

Physical clone: the cDNA fragment derived by the screening procedure, covering
the entire open reading frame, is cloned as a recombinant DNA into the pACT2 plasmid
(Clontech) used to make the cDNA library. The recombinant plasmid is inserted into the
host and selected by the yeast hybrid diploid generated during the screening procedure by
the mating of both CuraGen Corporation proprietary yeast strains N106' and YULH (U.S.
Patents 6,057,101 and 6,083,693).

4. RACE: Techniques based on the polymerase chain reaction such as rapid 
amplification of cDNA ends (RACE), were used to isolate or complete the predicted 
sequence of the cDNA of the invention. Usually multiple clones were sequenced from one 
or more human samples to derive the sequences for fragments. Various human tissue 
samples from different donors were used for the RACE reaction. The sequences derived
from these procedures were included in the SeqCalling Assembly process described in 
preceding paragraphs. 

5. Exon Linking: The NOVX target sequences identified in the present
invention were subjected to the exon linking process to confirm the sequence. PCR
primers were designed by starting at the most upstream sequence available, for the forward 
primer, and at the most downstream sequence available for the reverse primer. In each 
case, the sequence was examined, walking inward from the respective termini toward the 
coding sequence, until a suitable sequence that is either unique or highly selective was 
encountered, or, in the case of the reverse primer, until the stop codon was reached. Such 
primers were designed based on in silico predictions for the full length cDNA, part (one or 
more exons) of the DNA or protein sequence of the target sequence, or by translated 
homology of the predicted exons to closely related human sequences or sequences from other species.
These primers were then employed in PCR amplification based on the following pool of 
human cDNAs: adrenal gland, bone marrow, brain - amygdala, brain - cerebellum, brain - 
hippocampus, brain - substantia nigra, brain - thalamus, brain - whole, fetal brain, fetal
kidney, fetal liver, fetal lung, heart, kidney, lymphoma - Raji, mammary gland, pancreas, 
pituitary gland, placenta, prostate, salivary gland, skeletal muscle, small intestine, spinal 
cord, spleen, stomach, testis, thyroid, trachea, uterus. Usually the resulting amplicons were 
gel purified, cloned and sequenced to high redundancy. The PCR product derived from 
exon linking was cloned into the pCR2.1 vector from Invitrogen. The resulting bacterial
clone has an insert covering the entire open reading frame cloned into the pCR2.1 vector.
The resulting sequences from all clones were assembled with themselves, with other 
fragments in CuraGen Corporation's database and with public ESTs. Fragments and ESTs 
were included as components for an assembly when the extent of their identity with another 
component of the assembly was at least 95% over 50 bp. In addition, sequence traces were 
evaluated manually and edited for corrections if appropriate. These procedures provide the 
sequence reported herein.

6. Physical Clone: Exons were predicted by homology and the intron/exon
boundaries were determined using standard genetic rules. Exons were further selected and 
refined by means of similarity determination using multiple BLAST (for example, tBlastN, 
BlastX, and BlastN) searches, and, in some instances, GeneScan and Grail. Expressed 
sequences from both public and proprietary databases were also added when available to 
further define and complete the gene sequence. The DNA sequence was then manually 
corrected for apparent inconsistencies thereby obtaining the sequences encoding the 
full-length protein. 

The PCR product derived by exon linking, covering the entire open reading frame, 
was cloned into the pCR2.1 vector from Invitrogen to provide clones used for expression 
and screening purposes. 

Example C: Quantitative expression analysis of clones in various cells and tissues 

The quantitative expression of various clones was assessed using microtiter plates 
containing RNA samples from a variety of normal and pathology-derived cells, cell lines 
and tissues using real time quantitative PCR (RTQ PCR). RTQ PCR was performed on an 
Applied Biosystems ABI PRISM® 7700 or an ABI PRISM® 7900 HT Sequence Detection 
System. Various collections of samples are assembled on the plates, and referred to as 
Panel 1 (containing normal tissues and cancer cell lines), Panel 2 (containing samples 
derived from tissues from normal and cancer sources), Panel 3 (containing cancer cell 
lines), Panel 4 (containing cells and cell lines from normal tissues and cells related to 
inflammatory conditions), Panel 5D/5I (containing human tissues and cell lines with an 
emphasis on metabolic diseases), AI_comprehensive_panel (containing normal tissue and 
samples from autoinflammatory diseases), Panel CNSD.01 (containing samples from 
normal and diseased brains) and CNS_neurodegeneration_panel (containing samples from 
normal and Alzheimer's diseased brains). 

RNA integrity from all samples is controlled for quality by visual assessment of
agarose gel electropherograms using 28S and 18S ribosomal RNA staining intensity ratio
as a guide (2:1 to 2.5:1 28S:18S) and the absence of low molecular weight RNAs that would
be indicative of degradation products. Samples are controlled against genomic DNA 
contamination by RTQ PCR reactions run in the absence of reverse transcriptase using 
probe and primer sets designed to amplify across the span of a single exon. 
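
The 28S:18S guide range above can be expressed as a simple numeric check. The sketch below is illustrative only: the text describes a visual assessment, so the use of numeric band intensities, the function name, and the example readings are all assumptions.

```python
def ribosomal_ratio_ok(intensity_28s, intensity_18s, low=2.0, high=2.5):
    """Return True when the 28S:18S staining ratio lies within [low, high].

    The 2:1 to 2.5:1 range is the guide stated in the text; the inputs are
    hypothetical band intensities in arbitrary units.
    """
    if intensity_18s <= 0:
        raise ValueError("18S intensity must be positive")
    ratio = intensity_28s / intensity_18s
    return low <= ratio <= high


# Hypothetical densitometry readings, for illustration only
print(ribosomal_ratio_ok(2300.0, 1000.0))  # ratio 2.3, inside the guide range
print(ribosomal_ratio_ok(1400.0, 1000.0))  # ratio 1.4, below the guide range
```

A ratio check of this kind would not replace the second criterion in the text, the absence of low molecular weight degradation products.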

First, the RNA samples were normalized to reference nucleic acids such as 
constitutively expressed genes (for example, β-actin and GAPDH). Normalized RNA (5 µl)
was converted to cDNA and analyzed by RTQ-PCR using One Step RT-PCR Master Mix 
Reagents (Applied Biosystems; Catalog No. 4309169) and gene-specific primers according 
to the manufacturer's instructions. 

In other cases, non-normalized RNA samples were converted to single strand cDNA 
(sscDNA) using Superscript II (Invitrogen Corporation; Catalog No. 18064-147) and 
random hexamers according to the manufacturer's instructions. Reactions containing up to 
10 µg of total RNA were performed in a volume of 20 µl and incubated for 60 minutes at
42°C. This reaction can be scaled up to 50 µg of total RNA in a final volume of 100 µl.
sscDNA samples are then normalized to reference nucleic acids as described previously,
using 1X TaqMan® Universal Master mix (Applied Biosystems; catalog No. 4324020),
following the manufacturer's instructions. 

Probes and primers were designed for each assay according to Applied Biosystems 
Primer Express Software package (version I for Apple Computer's Macintosh PowerPC) or 
a similar algorithm using the target sequence as input. Default settings were used for 
reaction conditions and the following parameters were set before selecting primers: primer 
concentration = 250 nM, primer melting temperature (Tm) range = 58°-60°C, primer 
optimal Tm = 59°C, maximum primer difference = 2°C, probe does not have 5'G, probe Tm 
must be 10°C greater than primer Tm, amplicon size 75 bp to 100 bp. The probes and
primers selected (see below) were synthesized by Synthegen (Houston, TX, USA). Probes 
were double purified by HPLC to remove uncoupled dye and evaluated by mass 
spectroscopy to verify coupling of reporter and quencher dyes to the 5' and 3' ends of the 
probe, respectively. Their final concentrations were: forward and reverse primers, 900nM 
each, and probe, 200nM. 
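
The design parameters listed above can be collected into a small validation routine. The following Python sketch is not the Primer Express algorithm: it assumes that melting temperatures have already been computed elsewhere, and it reads "probe Tm must be 10°C greater than primer Tm" as "at least 10°C above the higher of the two primer Tm values". The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AssayDesign:
    forward_tm: float      # forward primer Tm in degrees C (precomputed)
    reverse_tm: float      # reverse primer Tm in degrees C (precomputed)
    probe_tm: float        # probe Tm in degrees C (precomputed)
    probe_sequence: str    # probe sequence written 5' to 3'
    amplicon_length: int   # amplicon size in bp


def design_violations(design):
    """Return a list of the stated design parameters that the design breaks."""
    problems = []
    for name, tm in (("forward", design.forward_tm), ("reverse", design.reverse_tm)):
        if not 58.0 <= tm <= 60.0:
            problems.append(f"{name} primer Tm outside 58-60 C")
    if abs(design.forward_tm - design.reverse_tm) > 2.0:
        problems.append("primer Tm difference exceeds 2 C")
    if design.probe_sequence.upper().startswith("G"):
        problems.append("probe has a 5' G")
    if design.probe_tm - max(design.forward_tm, design.reverse_tm) < 10.0:
        problems.append("probe Tm is less than 10 C above primer Tm")
    if not 75 <= design.amplicon_length <= 100:
        problems.append("amplicon length outside 75-100 bp")
    return problems


# Hypothetical designs, for illustration only
good = AssayDesign(59.0, 58.5, 69.5, "CATGACCTGAGGCTTCAGTTCC", 82)
bad = AssayDesign(59.0, 61.5, 66.0, "GATGACCTGAGGCTTCAGTTCC", 120)
print(design_violations(good))
print(design_violations(bad))
```

Returning a list of violations rather than a single pass/fail flag makes it easier to see which parameter a candidate set fails.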

PCR conditions: When working with RNA samples, normalized RNA from each 
tissue and each cell line was spotted in each well of either a 96 well or a 384-well PCR 
plate (Applied Biosystems). PCR cocktails included either a single gene-specific probe and
primers set, or two multiplexed probe and primers sets (a set specific for the target clone
and another gene-specific set multiplexed with the target probe). PCR reactions were set up
using TaqMan® One-Step RT-PCR Master Mix (Applied Biosystems, Catalog No.
4313803) following manufacturer's instructions. Reverse transcription was performed at
48°C for 30 minutes followed by amplification/PCR cycles as follows: 95°C 10 min, then 
40 cycles of 95°C for 15 seconds, 60°C for 1 minute. Results were recorded as CT values 
(cycle at which a given sample crosses a threshold level of fluorescence) using a log scale, 
with the difference in RNA concentration between a given sample and the sample with the 
lowest CT value being represented as 2 to the power of delta CT. The percent relative
expression is then obtained by taking the reciprocal of this RNA difference and multiplying
by 100.
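
The calculation described above can be written out as a short Python sketch. The sample names and CT values are hypothetical. The function follows the stated steps: delta CT relative to the sample with the lowest CT, 2 to the power of delta CT as the RNA difference, then the reciprocal multiplied by 100.

```python
def percent_relative_expression(ct_values):
    """Map {sample: CT} to {sample: percent relative expression}.

    The sample with the lowest CT (highest expression) is the reference
    and is assigned 100%.
    """
    if not ct_values:
        raise ValueError("at least one CT value is required")
    lowest_ct = min(ct_values.values())
    result = {}
    for sample, ct in ct_values.items():
        delta_ct = ct - lowest_ct
        rna_difference = 2.0 ** delta_ct          # fold difference vs. reference
        result[sample] = 100.0 / rna_difference   # reciprocal, times 100
    return result


# Hypothetical CT values, for illustration only
example = {"sample A": 25.0, "sample B": 26.0, "sample C": 28.0}
expression = percent_relative_expression(example)
print(expression)
```

In this example a sample one cycle behind the reference is reported at 50%, and a sample three cycles behind at 12.5%, which reflects the log scale of CT values.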

When working with sscDNA samples, normalized sscDNA was used as described 
previously for RNA samples. PCR reactions containing one or two sets of probe and
primers were set up as described previously, using 1X TaqMan® Universal Master mix
(Applied Biosystems; catalog No. 4324020), following the manufacturer's instructions. 
PCR amplification was performed as follows: 95°C 10 min, then 40 cycles of 95°C for 15 
seconds, 60°C for 1 minute. Results were analyzed and processed as described previously. 
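
The two cycling programs described above can be written as data, which makes the programmed hold time easy to total. This is a sketch under stated assumptions: the step labels and variable names are hypothetical, and ramp times between temperatures, which the text does not give, are ignored.

```python
# Each step: (label, temperature in degrees C, hold time in seconds, repeats)
ONE_STEP_RT_PCR = [
    ("reverse transcription", 48, 30 * 60, 1),
    ("initial 95 C hold", 95, 10 * 60, 1),
    ("cycle: 95 C", 95, 15, 40),
    ("cycle: 60 C", 60, 60, 40),
]

# The sscDNA program omits the reverse transcription step
SSCDNA_PCR = ONE_STEP_RT_PCR[1:]


def programmed_minutes(program):
    """Total programmed hold time in minutes, excluding ramp times."""
    total_seconds = sum(seconds * repeats for _, _, seconds, repeats in program)
    return total_seconds / 60.0


print(programmed_minutes(ONE_STEP_RT_PCR))  # 90.0
print(programmed_minutes(SSCDNA_PCR))       # 60.0
```

The totals are 90 minutes for the one-step RT-PCR program and 60 minutes for the sscDNA program, before any instrument ramping.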

Panels 1, 1.1, 1.2, and 1.3D

The plates for Panels 1, 1.1, 1.2 and 1.3D include 2 control wells (genomic DNA
control and chemistry control) and 94 wells containing cDNA from various samples. The
samples in these panels are broken into 2 classes: samples derived from cultured cell lines 
and samples derived from primary normal tissues. The cell lines are derived from cancers 
of the following types: lung cancer, breast cancer, melanoma, colon cancer, prostate cancer,
CNS cancer, squamous cell carcinoma, ovarian cancer, liver cancer, renal cancer, gastric
cancer and pancreatic cancer. Cell lines used in these panels are widely available through 
the American Type Culture Collection (ATCC), a repository for cultured cell lines, and 
were cultured using the conditions recommended by the ATCC. The normal tissues found 
on these panels are comprised of samples derived from all major organ systems from single
adult individuals or fetuses. These samples are derived from the following organs: adult
skeletal muscle, fetal skeletal muscle, adult heart, fetal heart, adult kidney, fetal kidney, 
adult liver, fetal liver, adult lung, fetal lung, various regions of the brain, the spleen, bone 
marrow, lymph node, pancreas, salivary gland, pituitary gland, adrenal gland, spinal cord, 
thymus, stomach, small intestine, colon, bladder, trachea, breast, ovary, uterus, placenta,
prostate, testis and adipose.

In the results for Panels 1, 1.1, 1.2 and 1.3D, the following abbreviations are used: 
ca. = carcinoma, 
* = established from metastasis,
met = metastasis,
s cell var = small cell variant,
non-s = non-sm = non-small,
squam = squamous,
pl. eff = pl effusion = pleural effusion,
glio = glioma, 
astro = astrocytoma, and 
neuro = neuroblastoma. 

General_screening_panel_v1.4, v1.5 and v1.6

The plates for Panels 1.4, 1.5, and 1.6 include 2 control wells (genomic DNA
control and chemistry control) and 94 wells containing cDNA from various samples. The
samples in Panels 1.4, 1.5, and 1.6 are broken into 2 classes: samples derived from cultured 
cell lines and samples derived from primary normal tissues. The cell lines are derived from 
cancers of the following types: lung cancer, breast cancer, melanoma, colon cancer,
prostate cancer, CNS cancer, squamous cell carcinoma, ovarian cancer, liver cancer, renal
cancer, gastric cancer and pancreatic cancer. Cell lines used in Panels 1.4, 1.5, and 1.6 are 
widely available through the American Type Culture Collection (ATCC), a repository for 
cultured cell lines, and were cultured using the conditions recommended by the ATCC. The 
normal tissues found on Panels 1.4, 1.5, and 1.6 are comprised of pools of samples derived
from all major organ systems from 2 to 5 different adult individuals or fetuses. These
samples are derived from the following organs: adult skeletal muscle, fetal skeletal muscle,
adult heart, fetal heart, adult kidney, fetal kidney, adult liver, fetal liver, adult lung, fetal 
lung, various regions of the brain, the spleen, bone marrow, lymph node, pancreas, salivary 
gland, pituitary gland, adrenal gland, spinal cord, thymus, stomach, small intestine, colon,
bladder, trachea, breast, ovary, uterus, placenta, prostate, testis and adipose. Abbreviations
are as described for Panels 1, 1.1, 1.2, and 1.3D. 

Panels 2D, 2.2, 2.3 and 2.4

The plates for Panels 2D, 2.2, 2.3 and 2.4 generally include 2 control wells and 94
test samples composed of RNA or cDNA isolated from human tissue procured by surgeons 
working in close cooperation with the National Cancer Institute's Cooperative Human 
Tissue Network (CHTN) or the National Disease Research Initiative (NDRI) or from 
Ardais or Clinomics. The tissues are derived from human malignancies and in cases where
indicated many malignant tissues have "matched margins" obtained from noncancerous 
tissue just adjacent to the tumor. These are termed normal adjacent tissues and are denoted 
"NAT" in the results below. The tumor tissue and the "matched margins" are evaluated by 
two independent pathologists (the surgical pathologists and again by a pathologist at NDRI/ 
CHTN/Ardais/Clinomics). Unmatched RNA samples from tissues without malignancy 
(normal tissues) were also obtained from Ardais or Clinomics. This analysis provides a 
gross histopathological assessment of tumor differentiation grade. Moreover, most samples 
include the original surgical pathology report that provides information regarding the 
clinical stage of the patient. These matched margins are taken from the tissue surrounding 
(i.e. immediately proximal) to the zone of surgery (designated "NAT", for normal adjacent 
tissue, in Table RR). In addition, RNA and cDNA samples were obtained from various 
human tissues derived from autopsies performed on elderly people or sudden death victims
(accidents, etc.). These tissues were ascertained to be free of disease and were purchased 
from various commercial sources such as Clontech (Palo Alto, CA), Research Genetics, 
and Invitrogen. 

HASS Panel v1.0

The HASS panel v1.0 plates are comprised of 93 cDNA samples and two controls.
Specifically, 81 of these samples are derived from cultured human cancer cell lines that had 
been subjected to serum starvation, acidosis and anoxia for different time periods as well as 
controls for these treatments, 3 samples of human primary cells, 9 samples of malignant 
brain cancer (4 medulloblastomas and 5 glioblastomas) and 2 controls. The human cancer 
cell lines are obtained from ATCC (American Type Culture Collection) and fall into the 
following tissue groups: breast cancer, prostate cancer, bladder carcinomas, pancreatic 
cancers and CNS cancer cell lines. These cancer cells are all cultured under standard 
recommended conditions. The treatments used (serum starvation, acidosis and anoxia) have
been previously published in the scientific literature. The primary human cells were 
obtained from Clonetics (Walkersville, MD) and were grown in the media and conditions 
recommended by Clonetics. The malignant brain cancer samples are obtained as part of a 
collaboration (Henry Ford Cancer Center) and are evaluated by a pathologist prior to
CuraGen receiving the samples. RNA was prepared from these samples using the standard
procedures. The genomic and chemistry control wells have been described previously.

ARDAIS Panel v1.0

The plates for ARDAIS panel v1.0 generally include 2 control wells and 22 test
samples composed of RNA isolated from human tissue procured by surgeons working in 
close cooperation with Ardais Corporation. The tissues are derived from human lung 
malignancies (lung adenocarcinoma or lung squamous cell carcinoma) and in cases where 
indicated many malignant samples have "matched margins" obtained from noncancerous 
lung tissue just adjacent to the tumor. These matched margins are taken from the tissue 
surrounding (i.e. immediately proximal) to the zone of surgery (designated "NAT", for 
normal adjacent tissue) in the results below. The tumor tissue and the "matched margins" 
are evaluated by independent pathologists (the surgical pathologists and again by a 
pathologist at Ardais). Unmatched malignant and non-malignant RNA samples from lungs 
were also obtained from Ardais. Additional information from Ardais provides a gross 
histopathological assessment of tumor differentiation grade and stage. Moreover, most 
samples include the original surgical pathology report that provides information regarding 
the clinical state of the patient. 

Panel 3D, 3.1 and 3.2 

The plates of Panel 3D, 3.1, and 3.2 are comprised of 94 cDNA samples and two 
control samples. Specifically, 92 of these samples are derived from cultured human cancer 
cell lines, 2 samples of human primary cerebellar tissue and 2 controls. The human cell 
lines are generally obtained from ATCC (American Type Culture Collection), NCI or the 
German tumor cell bank and fall into the following tissue groups: Squamous cell carcinoma 
of the tongue, breast cancer, prostate cancer, melanoma, epidermoid carcinoma, sarcomas, 
bladder carcinomas, pancreatic cancers, kidney cancers, leukemias/lymphomas, 
ovarian/uterine/cervical, gastric, colon, lung and CNS cancer cell lines. In addition, there 
are two independent samples of cerebellum. These cells are all cultured under standard 
recommended conditions and RNA extracted using the standard procedures. The cell lines 
in panel 3D, 3.1, 3.2, 1, 1.1, 1.2, 1.3D, 1.4, 1.5, and 1.6 are among the most common cell lines
used in the scientific literature. 

Panels 4D, 4R, and 4.1D

Panel 4 includes samples on a 96 well plate (2 control wells, 94 test samples)
composed of RNA (Panel 4R) or cDNA (Panels 4D/4.1D) isolated from various human cell 
lines or tissues related to inflammatory conditions. Total RNA from control normal tissues 
such as colon and lung (Stratagene, La Jolla, CA) and thymus and kidney (Clontech) was 
employed. Total RNA from liver tissue from cirrhosis patients and kidney from lupus 
patients was obtained from BioChain (Biochain Institute, Inc., Hayward, CA). Intestinal 
tissue for RNA preparation from patients diagnosed as having Crohn's disease and 
ulcerative colitis was obtained from the National Disease Research Interchange (NDRI) 
(Philadelphia, PA). 

Astrocytes, lung fibroblasts, dermal fibroblasts, coronary artery smooth muscle 
cells, small airway epithelium, bronchial epithelium, microvascular dermal endothelial 
cells, microvascular lung endothelial cells, human pulmonary aortic endothelial cells, 
human umbilical vein endothelial cells were all purchased from Clonetics (Walkersville, 
MD) and grown in the media supplied for these cell types by Clonetics. These primary cell 
types were activated with various cytokines or combinations of cytokines for 6 and/or 
12-14 hours, as indicated. The following cytokines were used: IL-1 beta at approximately
1-5ng/ml, TNF alpha at approximately 5-10ng/ml, IFN gamma at approximately
20-50ng/ml, IL-4 at approximately 5-10ng/ml, IL-9 at approximately 5-10ng/ml, IL-13 at
approximately 5-10ng/ml. Endothelial cells were sometimes starved for various times by
culture in the basal media from Clonetics with 0.1% serum. 

Mononuclear cells were prepared from blood of employees at CuraGen 
Corporation, using Ficoll. LAK cells were prepared from these cells by culture in DMEM 
5% FCS (Hyclone), 100µM non essential amino acids (Gibco/Life Technologies,
Rockville, MD), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10⁻⁵M (Gibco), and
10mM Hepes (Gibco) and Interleukin 2 for 4-6 days. Cells were then either activated with
10-20ng/ml PMA and 1-2µg/ml ionomycin, IL-12 at 5-10ng/ml, IFN gamma at 20-50ng/ml
and IL-18 at 5-10ng/ml for 6 hours. In some cases, mononuclear cells were cultured for 4-5
days in DMEM 5% FCS (Hyclone), 100µM non essential amino acids (Gibco), 1mM
sodium pyruvate (Gibco), mercaptoethanol 5.5x10⁻⁵M (Gibco), and 10mM Hepes (Gibco)
with PHA (phytohemagglutinin) or PWM (pokeweed mitogen) at approximately 5µg/ml.
Samples were taken at 24, 48 and 72 hours for RNA preparation. MLR (mixed lymphocyte 
reaction) samples were obtained by taking blood from two donors, isolating the 
mononuclear cells using Ficoll and mixing the isolated mononuclear cells 1:1 at a final 
concentration of approximately 2x10⁵ cells/ml in DMEM 5% FCS (Hyclone), 100µM non
essential amino acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol
(5.5x10⁻⁵M) (Gibco), and 10mM Hepes (Gibco). The MLR was cultured and samples taken
at various time points ranging from 1-7 days for RNA preparation. 

Monocytes were isolated from mononuclear cells using CD14 Miltenyi Beads, +ve
VS selection columns and a Vario Magnet according to the manufacturer's instructions.
Monocytes were differentiated into dendritic cells by culture in DMEM 5% fetal calf serum
(FCS) (Hyclone, Logan, UT), 100µM non essential amino acids (Gibco), 1mM sodium
pyruvate (Gibco), mercaptoethanol 5.5x10⁻⁵M (Gibco), and 10mM Hepes (Gibco), 50ng/ml
GMCSF and 5ng/ml IL-4 for 5-7 days. Macrophages were prepared by culture of
monocytes for 5-7 days in DMEM 5% FCS (Hyclone), 100µM non essential amino acids
(Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10⁻⁵M (Gibco), 10mM
Hepes (Gibco) and 10% AB Human Serum or MCSF at approximately 50ng/ml.
Monocytes, macrophages and dendritic cells were stimulated for 6 and 12-14 hours with
lipopolysaccharide (LPS) at 100ng/ml. Dendritic cells were also stimulated with anti-CD40
monoclonal antibody (Pharmingen) at 10µg/ml for 6 and 12-14 hours.

CD4 lymphocytes, CD8 lymphocytes and NK cells were also isolated from 
mononuclear cells using CD4, CD8 and CD56 Miltenyi beads, positive VS selection 
columns and a Vario Magnet according to the manufacturer's instructions. CD45RA and 
CD45RO CD4 lymphocytes were isolated by depleting mononuclear cells of CD8, CD56, 
CD14 and CD19 cells using CD8, CD56, CD14 and CD19 Miltenyi beads and positive 
selection. CD45RO beads were then used to isolate the CD45RO CD4 lymphocytes with 
the remaining cells being CD45RA CD4 lymphocytes. CD45RA CD4, CD45RO CD4 and 
CD8 lymphocytes were placed in DMEM 5% FCS (Hyclone), 100µM non essential amino
acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10⁻⁵M (Gibco), and
10mM Hepes (Gibco) and plated at 10⁶ cells/ml onto Falcon 6 well tissue culture plates that
had been coated overnight with 0.5µg/ml anti-CD28 (Pharmingen) and 3µg/ml anti-CD3
(OKT3, ATCC) in PBS. After 6 and 24 hours, the cells were harvested for RNA
preparation. To prepare chronically activated CD8 lymphocytes, we activated the isolated
CD8 lymphocytes for 4 days on anti-CD28 and anti-CD3 coated plates and then harvested
the cells and expanded them in DMEM 5% FCS (Hyclone), 100µM non essential amino
acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10⁻⁵M (Gibco), and
10mM Hepes (Gibco) and IL-2. The expanded CD8 cells were then activated again with
plate bound anti-CD3 and anti-CD28 for 4 days and expanded as before. RNA was isolated
6 and 24 hours after the second activation and after 4 days of the second expansion culture.
The isolated NK cells were cultured in DMEM 5% FCS (Hyclone), 100µM non essential
amino acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10⁻⁵M (Gibco),
and 10mM Hepes (Gibco) and IL-2 for 4-6 days before RNA was prepared.

To obtain B cells, tonsils were procured from NDRI. The tonsil was cut up with 
sterile dissecting scissors and then passed through a sieve. Tonsil cells were then spun 
down and resuspended at 10⁶ cells/ml in DMEM 5% FCS (Hyclone), 100µM non essential
amino acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10⁻⁵M (Gibco),
and 10mM Hepes (Gibco). To activate the cells, we used PWM at 5µg/ml or anti-CD40
(Pharmingen) at approximately 10µg/ml and IL-4 at 5-10ng/ml. Cells were harvested for
RNA preparation at 24, 48 and 72 hours.

To prepare the primary and secondary Th1/Th2 and Tr1 cells, six-well Falcon plates
were coated overnight with 10µg/ml anti-CD28 (Pharmingen) and 2µg/ml OKT3 (ATCC),
and then washed twice with PBS. Umbilical cord blood CD4 lymphocytes (Poietic
Systems, German Town, MD) were cultured at 10^5-10^6 cells/ml in DMEM 5% FCS
(Hyclone), 100µM non essential amino acids (Gibco), 1mM sodium pyruvate (Gibco),
mercaptoethanol 5.5x10^-5 M (Gibco), 10mM Hepes (Gibco) and IL-2 (4ng/ml). IL-12
(5ng/ml) and anti-IL4 (1µg/ml) were used to direct to Th1, while IL-4 (5ng/ml) and
anti-IFN gamma (1µg/ml) were used to direct to Th2 and IL-10 at 5ng/ml was used to
direct to Tr1. After 4-5 days, the activated Th1, Th2 and Tr1 lymphocytes were washed
once in DMEM and expanded for 4-7 days in DMEM 5% FCS (Hyclone), 100µM non
essential amino acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10^-5 M
(Gibco), 10mM Hepes (Gibco) and IL-2 (1ng/ml). Following this, the activated Th1, Th2
and Tr1 lymphocytes were re-stimulated for 5 days with anti-CD28/OKT3 and cytokines as
described above, but with the addition of anti-CD95L (1µg/ml) to prevent apoptosis. After
4-5 days, the Th1, Th2 and Tr1 lymphocytes were washed and then expanded again with
IL-2 for 4-7 days. Activated Th1 and Th2 lymphocytes were maintained in this way for a
maximum of three cycles. RNA was prepared from primary and secondary Th1, Th2 and
Tr1 after 6 and 24 hours following the second and third activations with plate bound
anti-CD3 and anti-CD28 mAbs and 4 days into the second and third expansion cultures in
Interleukin 2.
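
The polarizing conditions described above can be summarized as a small lookup table, which makes the differences between the three lineages easy to compare. The following is a minimal sketch rather than part of the original protocol: the names `BASE_ADDITIVES`, `POLARIZING_ADDITIVES`, `RESTIMULATION_ADDITIVES` and `culture_additives` are hypothetical, and keeping IL-2 at 4ng/ml during re-stimulation is an assumption based on the phrase "cytokines as described above".

```python
# Illustrative encoding of the polarizing conditions described in the text.
# Concentrations are (value, unit) pairs copied from the protocol.
BASE_ADDITIVES = {"IL-2": (4, "ng/ml")}

POLARIZING_ADDITIVES = {
    "Th1": {"IL-12": (5, "ng/ml"), "anti-IL4": (1, "ug/ml")},
    "Th2": {"IL-4": (5, "ng/ml"), "anti-IFN gamma": (1, "ug/ml")},
    "Tr1": {"IL-10": (5, "ng/ml")},
}

# Added only during re-stimulation, to prevent apoptosis.
RESTIMULATION_ADDITIVES = {"anti-CD95L": (1, "ug/ml")}


def culture_additives(lineage, restimulation=False):
    """Return the soluble additives for one lineage as a new dict."""
    if lineage not in POLARIZING_ADDITIVES:
        raise ValueError(f"unknown lineage: {lineage!r}")
    additives = dict(BASE_ADDITIVES)
    additives.update(POLARIZING_ADDITIVES[lineage])
    if restimulation:
        # Assumption: IL-2 stays at 4ng/ml during re-stimulation.
        additives.update(RESTIMULATION_ADDITIVES)
    return additives
```

For example, `culture_additives("Tr1", restimulation=True)` returns the IL-2, IL-10 and anti-CD95L entries. Plate-bound anti-CD28 and OKT3 are deliberately left out, since they are coated onto the plates rather than added to the medium.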

The following leukocyte cell lines were obtained from the ATCC: Ramos, EOL-1,
KU-812. EOL cells were further differentiated by culture in 0.1mM dbcAMP at
5x10^5 cells/ml for 8 days, changing the media every 3 days and adjusting the cell
concentration to 5x10^5 cells/ml. For the culture of these cells, we used DMEM or RPMI (as
recommended by the ATCC), with the addition of 5% FCS (Hyclone), 100µM non
essential amino acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10^-5 M
(Gibco), 10mM Hepes (Gibco). RNA was either prepared from resting cells or cells
activated with PMA at 10ng/ml and ionomycin at 1µg/ml for 6 and 14 hours. Keratinocyte
line CCD1106 and an airway epithelial tumor line NCI-H292 were also obtained from the
ATCC. Both were cultured in DMEM 5% FCS (Hyclone), 100µM non essential amino
acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5x10^-5 M (Gibco), and
10mM Hepes (Gibco). CCD1106 cells were activated for 6 and 14 hours with
approximately 5 ng/ml TNF alpha and 1ng/ml IL-1 beta, while NCI-H292 cells were
activated for 6 and 14 hours with the following cytokines: 5ng/ml IL-4, 5ng/ml IL-9,
5ng/ml IL-13 and 25ng/ml IFN gamma.

For these cell lines and blood cells, RNA was prepared by lysing approximately
10^7 cells/ml using Trizol (Gibco BRL). Briefly, 1/10 volume of bromochloropropane
(Molecular Research Corporation) was added to the RNA sample, vortexed and after 10
minutes at room temperature, the tubes were spun at 14,000 rpm in a Sorvall SS34 rotor.
The aqueous phase was removed and placed in a 15ml Falcon Tube. An equal volume of
isopropanol was added and left at -20°C overnight. The precipitated RNA was spun down
at 9,000 rpm for 15 min in a Sorvall SS34 rotor and washed in 70% ethanol. The pellet was
redissolved in 300µl of RNAse-free water and 35µl buffer (Promega), 5µl DTT, 7µl
RNAsin and 8µl DNAse were added. The tube was incubated at 37°C for 30 minutes to
remove contaminating genomic DNA, extracted once with phenol chloroform and
re-precipitated with 1/10 volume of 3M sodium acetate and 2 volumes of 100% ethanol.
The RNA was spun down and placed in RNAse free water. RNA was stored at -80°C.
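
The volumes in the DNAse treatment and re-precipitation steps follow from simple arithmetic, sketched below. This is an illustration under stated assumptions, not the authors' worksheet: the function names are hypothetical, "volumes" are taken relative to the sample volume before any additions, and the volume recovered after the phenol chloroform extraction is assumed to equal the reaction volume.

```python
def dnase_reaction_volume_ul(water=300.0, buffer=35.0, dtt=5.0,
                             rnasin=7.0, dnase=8.0):
    """Total DNAse reaction volume in microliters (defaults from the text)."""
    return water + buffer + dtt + rnasin + dnase


def reprecipitation_volumes_ul(sample_ul):
    """Return (3M sodium acetate, 100% ethanol) volumes in microliters.

    Assumption: both are computed from the starting sample volume,
    i.e. 1/10 volume of acetate and 2 volumes of ethanol.
    """
    acetate = sample_ul / 10.0
    ethanol = 2.0 * sample_ul
    return acetate, ethanol


total = dnase_reaction_volume_ul()                    # 355.0
acetate, ethanol = reprecipitation_volumes_ul(total)  # 35.5, 710.0
print(f"reaction {total} ul, acetate {acetate} ul, ethanol {ethanol} ul")
```

If ethanol is instead measured against the sample plus acetate, the ethanol volume scales up accordingly; the text does not say which convention was used.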

AI_comprehensive panel_v1.0

The plates for AI_comprehensive panel_v1.0 include two control wells and 89 test
samples comprised of cDNA isolated from surgical and postmortem human tissues 
obtained from the Backus Hospital and Clinomics (Frederick, MD). Total RNA was 
extracted from tissue samples from the Backus Hospital in the Facility at CuraGen. Total 
RNA from other tissues was obtained from Clinomics. 

Joint tissues including synovial fluid, synovium, bone and cartilage were obtained
from patients undergoing total knee or hip replacement surgery at the Backus Hospital. 
Tissue samples were immediately snap frozen in liquid nitrogen to ensure that isolated 
RNA was of optimal quality and not degraded. Additional samples of osteoarthritis and 
rheumatoid arthritis joint tissues were obtained from Clinomics. Normal control tissues 
were supplied by Clinomics and were obtained during autopsy of trauma victims. 

Surgical specimens of psoriatic tissues and adjacent matched tissues were provided 
as total RNA by Clinomics. Two male and two female patients were selected between the 
ages of 25 and 47. None of the patients were taking prescription drugs at the time samples 
were isolated. 

Surgical specimens of diseased colon from patients with ulcerative colitis and 
Crohn's disease and adjacent matched tissues were obtained from Clinomics. Bowel tissue
from three female and three male Crohn's patients between the ages of 41-69 was used.
Two patients were not on prescription medication while the others were taking 
dexamethasone, phenobarbital, or tylenol. Ulcerative colitis tissue was from three male and 
four female patients. Four of the patients were taking lebvid and two were on 
phenobarbital. 

Total RNA from post mortem lung tissue from trauma victims with no disease or 
with emphysema, asthma or COPD was purchased from Clinomics. Emphysema patients 
ranged in age from 40-70 and all were smokers; this age range was chosen to focus on
patients with cigarette-linked emphysema and to avoid those patients with
alpha-1 anti-trypsin deficiencies. Asthma patients ranged in age from 36-75, and excluded
smokers to avoid including patients who could also have COPD. COPD patients ranged in age
from 35-80 and included both smokers and non-smokers. Most patients were taking
corticosteroids and bronchodilators.

In the labels employed to identify tissues in the AI_comprehensive panel_v1.0
panel, the following abbreviations are used:
AI = Autoimmunity
Syn = Synovial
Normal = No apparent disease
Rep22 / Rep20 = individual patients
RA = Rheumatoid arthritis
Backus = From Backus Hospital
OA = Osteoarthritis
(SS) (BA) (MF) = Individual patients
Adj = Adjacent tissue
Match control = adjacent tissues
-M = Male
-F = Female
COPD = Chronic obstructive pulmonary disease
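
The abbreviations above can be applied mechanically to the sample labels used in the expression tables. The sketch below is illustrative only: `ABBREVIATIONS`, `SEX_SUFFIX` and `expand_label` are hypothetical names, and the matching rule (whole alphabetic tokens plus a trailing -M/-F suffix) is an assumption about how the labels are built.

```python
import re

# Subset of the abbreviations listed for the AI_comprehensive panel.
ABBREVIATIONS = {
    "AI": "Autoimmunity",
    "Syn": "Synovial",
    "RA": "Rheumatoid arthritis",
    "OA": "Osteoarthritis",
    "Adj": "Adjacent tissue",
    "COPD": "Chronic obstructive pulmonary disease",
    "Backus": "From Backus Hospital",
}
SEX_SUFFIX = {"-M": "Male", "-F": "Female"}


def expand_label(label):
    """Return a dict of the abbreviations found in a label and their meanings."""
    found = {}
    # A trailing -M or -F marks the sex of the donor.
    for suffix, meaning in SEX_SUFFIX.items():
        if label.endswith(suffix):
            found[suffix] = meaning
    # Whole alphabetic tokens are looked up in the abbreviation table.
    for token in re.findall(r"[A-Za-z]+", label):
        if token in ABBREVIATIONS:
            found[token] = ABBREVIATIONS[token]
    return found


print(expand_label("110967 COPD-F"))
print(expand_label("104689 (MF) OA Bone-Backus"))
```

Patient codes such as (MF) are intentionally left unexpanded, since the list only identifies them as individual patients.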
AI.05 chondrosarcoma 

The AI.05 chondrosarcoma plates are comprised of SW1353 cells that had been subjected
to serum starvation, and treatment with cytokines that are known to induce MMP (1, 3 and 13)
synthesis (e.g. IL-1beta). These treatments include: IL-1β (10 ng/ml), IL-1β + TNF-α (50 ng/ml),
IL-1β + Oncostatin (50 ng/ml) and PMA (100 ng/ml). The SW1353 cells were obtained from
ATCC (American Type Culture Collection) and were all cultured under standard recommended
conditions. The SW1353 cells were plated at 3 x 10^5 cells/ml (in DMEM medium-10 % FBS)
in 6-well plates. The treatment was done in triplicate, for 6 and 18 h. The supernatants were
collected for analysis of MMP 1, 3 and 13 production and for RNA extraction. RNA was prepared
from these samples using the standard procedures.

Panels 5D and 5I

The plates for Panel 5D and 5I include two control wells and a variety of cDNAs
isolated from human tissues and cell lines with an emphasis on metabolic diseases. 
Metabolic tissues were obtained from patients enrolled in the Gestational Diabetes study. 
Cells were obtained during different stages in the differentiation of adipocytes from human 
mesenchymal stem cells. Human pancreatic islets were also obtained. 

In the Gestational Diabetes study subjects are young (18-40 years), otherwise 
healthy women with and without gestational diabetes undergoing routine (elective) 
Caesarean section. After delivery of the infant, when the surgical incisions were being 
repaired/closed, the obstetrician removed a small sample (<1 cc) of the exposed metabolic 
tissues during the closure of each surgical level. The biopsy material was rinsed in sterile 
saline, blotted and fast frozen within 5 minutes from the time of removal. The tissue was 
then flash frozen in liquid nitrogen and stored, individually, in sterile screw-top tubes and 
kept on dry ice for shipment to or to be picked up by CuraGen. The metabolic tissues of 
interest include uterine wall (smooth muscle), visceral adipose, skeletal muscle (rectus) and
subcutaneous adipose. Patient descriptions are as follows:

Patient 2: Diabetic Hispanic, overweight, not on insulin
Patient 7-9: Nondiabetic Caucasian and obese (BMI>30)
Patient 10: Diabetic Hispanic, overweight, on insulin
Patient 11: Nondiabetic African American and overweight
Patient 12: Diabetic Hispanic on insulin

Adipocyte differentiation was induced in donor progenitor cells obtained from Osirus
(a division of Clonetics/BioWhittaker) in triplicate, except for Donor 3U which had only
two replicates. Scientists at Clonetics isolated, grew and differentiated human
mesenchymal stem cells (HuMSCs) for CuraGen based on the published protocol found in
Mark F. Pittenger, et al., Multilineage Potential of Adult Human Mesenchymal Stem Cells,
Science Apr 2 1999: 143-147. Clonetics provided Trizol lysates or frozen pellets suitable
for mRNA isolation and ds cDNA production. A general description of each donor is as
follows:

Donor 2 and 3 U: Mesenchymal Stem cells, Undifferentiated Adipose 
Donor 2 and 3 AM: Adipose, Adipose Midway Differentiated
Donor 2 and 3 AD: Adipose, Adipose Differentiated 

Human cell lines were generally obtained from ATCC (American Type Culture
Collection), NCI or the German tumor cell bank and fall into the following tissue groups:
kidney proximal convoluted tubule, uterine smooth muscle cells, small intestine, liver
HepG2 cancer cells, heart primary stromal cells, and adrenal cortical adenoma cells. These
cells are all cultured under standard recommended conditions and RNA extracted using the
standard procedures. All samples were processed at CuraGen to produce single stranded
cDNA.

Panel 5I contains all samples previously described with the addition of pancreatic
islets from a 58 year old female patient obtained from the Diabetes Research Institute at the
University of Miami School of Medicine. Islet tissue was processed to total RNA at an
outside source and delivered to CuraGen for addition to panel 5I.

In the labels employed to identify tissues in the 5D and 5I panels, the following
abbreviations are used:
GO Adipose = Greater Omentum Adipose
SK = Skeletal Muscle 
UT = Uterus 
PL = Placenta 

AD = Adipose Differentiated 
AM = Adipose Midway Differentiated 
U = Undifferentiated Stem Cells 
Panel CNSD.01 

The plates for Panel CNSD.01 include two control wells and 94 test samples 
comprised of cDNA isolated from postmortem human brain tissue obtained from the 
Harvard Brain Tissue Resource Center. Brains are removed from calvaria of donors 
between 4 and 24 hours after death, sectioned by neuroanatomists, and frozen at -80°C in 
liquid nitrogen vapor. All brains are sectioned and examined by neuropathologists to 
confirm diagnoses with clear associated neuropathology. 

Disease diagnoses are taken from patient records. The panel contains two brains
from each of the following diagnoses: Alzheimer's disease, Parkinson's disease,
Huntington's disease, Progressive Supranuclear Palsy, Depression, and "Normal controls".
Within each of these brains, the following regions are represented: cingulate gyrus,
temporal pole, globus pallidus, substantia nigra, Brodmann Area 4 (primary motor strip),
Brodmann Area 7 (parietal cortex), Brodmann Area 9 (prefrontal cortex), and Brodmann area
17 (occipital cortex). Not all brain regions are represented in all cases; e.g., Huntington's
disease is characterized in part by neurodegeneration in the globus pallidus, thus this
region is impossible to obtain from confirmed Huntington's cases. Likewise Parkinson's
disease is characterized by degeneration of the substantia nigra making this region more
difficult to obtain. Normal control brains were examined for neuropathology and found to
be free of any pathology consistent with neurodegeneration.

In the labels employed to identify tissues in the CNS panel, the following
abbreviations are used:

PSP = Progressive supranuclear palsy
Sub Nigra = Substantia nigra
Glob Palladus = Globus pallidus
Temp Pole = Temporal pole
Cing Gyr = Cingulate gyrus
BA 4 = Brodmann Area 4

Panel CNS_Neurodegeneration_V1.0

The plates for Panel CNS_Neurodegeneration_V1.0 include two control wells and
47 test samples comprised of cDNA isolated from postmortem human brain tissue obtained 
from the Harvard Brain Tissue Resource Center (McLean Hospital) and the Human Brain 
and Spinal Fluid Resource Center (VA Greater Los Angeles Healthcare System). Brains are 
removed from calvaria of donors between 4 and 24 hours after death, sectioned by 
neuroanatomists, and frozen at -80°C in liquid nitrogen vapor. All brains are sectioned and 
examined by neuropathologists to confirm diagnoses with clear associated neuropathology. 

Disease diagnoses are taken from patient records. The panel contains six brains 
from Alzheimer's disease (AD) patients, and eight brains from "Normal controls" who 
showed no evidence of dementia prior to death. The eight normal control brains are divided
into two categories: controls with no dementia and no Alzheimer's-like pathology
(Controls) and controls with no dementia but evidence of severe Alzheimer's-like
pathology (specifically senile plaque load rated as level 3 on a scale of 0-3; 0 = no
evidence of plaques, 3 = severe AD senile plaque load). Within each of these brains, the
following regions are represented: hippocampus, temporal cortex (Brodmann Area 21),
parietal cortex (Brodmann area 7), and occipital cortex (Brodmann area 17). These regions
were chosen to encompass all levels of neurodegeneration in AD. The hippocampus is a 
region of early and severe neuronal loss in AD; the temporal cortex is known to show 
neurodegeneration in AD after the hippocampus; the parietal cortex shows moderate 
neuronal death in the late stages of the disease; the occipital cortex is spared in AD and 
therefore acts as a "control" region within AD patients. Not all brain regions are 
represented in all cases. 

In the labels employed to identify tissues in the CNS_Neurodegeneration_V1.0
panel, the following abbreviations are used:

AD = Alzheimer's disease brain; patient was demented and showed AD-like
pathology upon autopsy
Control = Control brains; patient not demented, showing no neuropathology
Control (Path) = Control brains; patient not demented but showing severe AD-like
pathology
SupTemporal Ctx = Superior Temporal Cortex
Inf Temporal Ctx = Inferior Temporal Cortex

A. CG106764-01: RHO/RAC-INTERACTING CITRON KINASE. 

Expression of gene CG106764-01 was assessed using the primer-probe set Ag2100,
described in Table AA. Results of the RTQ-PCR runs are shown in Tables AB, AC, AD,
AE, AF, AG, AH and AI.

Table AA. Probe Name Ag2100

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-agatccctggaacagaggatt-3' | 21 | 2446 | 249
Probe | TET-5'-tgtctgaagccaataaacttgcagca-3'-TAMRA | 26 | 2474 | 250
Reverse | 5'-ccttcatgttcctttgggtaa-3' | 21 | 2513 | 251
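
The entries of Table AA can be checked for internal consistency with a few lines of code. The sketch below is illustrative: the names `PRIMERS`, `lengths_match` and `amplicon_span` are hypothetical, and it assumes that each "Start Position" is the lowest coordinate covered by that oligonucleotide on the target sequence, with all three positions on the same coordinate axis.

```python
# (sequence, stated length, stated start position) copied from Table AA.
PRIMERS = {
    "Forward": ("agatccctggaacagaggatt", 21, 2446),
    "Probe": ("tgtctgaagccaataaacttgcagca", 26, 2474),
    "Reverse": ("ccttcatgttcctttgggtaa", 21, 2513),
}


def lengths_match(primers):
    """Map each oligo name to True if its sequence has the stated length."""
    return {name: len(seq) == length
            for name, (seq, length, _start) in primers.items()}


def amplicon_span(primers):
    """Return (first base, last base, length) of the amplified region.

    Assumption: the reverse primer covers start .. start + length - 1.
    """
    first = primers["Forward"][2]
    _seq, rev_len, rev_start = primers["Reverse"]
    last = rev_start + rev_len - 1
    return first, last, last - first + 1


print(lengths_match(PRIMERS))
print(amplicon_span(PRIMERS))
```

Under these assumptions all three stated lengths agree with the sequences, the probe (2474 to 2499) lies between the two primers, and the amplicon runs from position 2446 to 2533, or 88 bases.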



Table AB. AI.05 chondrosarcoma

Tissue Name | Rel. Exp.(%) Ag2100, Run 306913849 | Tissue Name | Rel. Exp.(%) Ag2100, Run 306913849
138353_PMA (18hrs) | 9.3 | 138346_IL-1beta + Oncostatin M (6hrs) | 64.2
138352_IL-1beta + Oncostatin M (18hrs) | 5.5 | 138345_IL-1beta+TNFa (6hrs) | 44.8
138351_IL-1beta+TNFa (18hrs) | 12.5 | 138344_IL-1beta (6hrs) | 25.5
138350_IL-1beta (18hrs) | 12.5 | 138349_Untreated-serum starved (6hrs) | 100.0
138354_Untreated-complete medium (18hrs) | 13.2 | 138348_Untreated-complete medium (6hrs) | 41.2
138347_PMA (6hrs) | 34.9 | |

Table AC. AI_comprehensive panel_v1.0

Tissue Name | Rel. Exp.(%) Ag2100, Run 211059880 | Rel. Exp.(%) Ag2100, Run 212328504 | Tissue Name | Rel. Exp.(%) Ag2100, Run 211059880 | Rel. Exp.(%) Ag2100, Run 212328504
110967 COPD-F | 0.5 | 0.8 | 112427 Match Control Psoriasis-F | 2.9 | 1.8
110980 COPD-F | 1.5 | 1.2 | 112418 Psoriasis-M | 0.8 | 0.8
 | ft A | 0.0 | 112723 Match Control Psoriasis-M | 6.1 | 7.4
110977 COPD-M | 1.5 | 1.9 | 112419 Psoriasis-M | 1.0 | 1.3
110989 Emphysema-F | 4.2 | 6.0 | 112424 Match Control Psoriasis-M | 0.4 | 1.2
110992 Emphysema-F | 2.8 | 2.9 | 112420 Psoriasis-M | 1.8 | 2.4
110993 Emphysema-F | 0.9 | 0.8 | 112425 Match Control Psoriasis-M | 2.2 | 2.7
110994 Emphysema-F | 0.7 | 0.4 | 104689 (MF) OA Bone-Backus | 12.1 | 13.2
110995 Emphysema-F | 2.0 | 5.4 | 104690 (MF) Adj "Normal" Bone-Backus | 5.4 | 4?
110996 Emphysema-F | 2.2 | 2.4 | 104691 (MF) OA Synovium-Backus | 43.2 | 35.6
110997 Asthma-M | 1.9 | 3.1 | 104692 (BA) OA Cartilage-Backus | 0.9 | 0.4
111001 Asthma-F | 1.4 | 2.7 | 104694 (BA) OA Bone-Backus | 16.8 | 16.7
111002 Asthma-F | 1.0 | 1.0 | 104695 (BA) Adj "Normal" Bone-Backus | 6.5 | 6.1
111003 Atopic Asthma-F | 4.0 | 2.2 | 104696 (BA) OA Synovium-Backus | 24.0 | 24.1
111004 Atopic Asthma-F | 16.6 | 17.0 | 104700 (SS) OA Bone-Backus | 12.2 | 35.1
111005 Atopic Asthma-F | 7.2 | 5.5 | 104701 (SS) Adj "Normal" Bone-Backus | 7.9 | 9.5
111006 Atopic Asthma-F | 0.9 | 0.7 | 104702 (SS) OA Synovium-Backus | 8.2 | 7.9
111417 Allergy-M | 1.9 | 2.4 | 117093 OA Cartilage Rep7 | 2.0 | 2.3
112347 Allergy-M | 0.0 | 0.1 | 112672 OA Bone5 | 1.9 | }.8
112349 Normal Lung-F | 0.0 | 3.0 | 112673 OA Synovium5 | }.3 | 1.2
112357 Normal Lung-F | 6.1 | 6.0 | 112674 OA Synovial Fluid cells5 | 0.5 | 0.4
112354 Normal Lung-M | 1.5 | 2.3 | 117100 OA Cartilage Rep14 | 0.4 | 0.3
112374 Crohns-F | 2.9 | 5.2 | 112756 OA Bone9 | 100.0 | 100.0
112389 Match Control Crohns-F | 9.0 | 6.8 | 112757 OA Synovium9 | 0.5 | 0.2
112375 Crohns-F | 2.5 | 3.8 | 112758 OA Synovial Fluid Cells9 | 0.8 | 1.5
112732 Match Control Crohns-F | 3.8 | 5.4 | 117125 RA Cartilage Rep2 | 1.0 | 0.6
112725 Crohns-M | 0.1 | 0.7 | 113492 Bone2 RA | 2.8 | 3.6
112387 Match Control Crohns-M | 1.0 | 1.4 | 113493 Synovium2 RA | 1.7 | 0.7
112378 Crohns-M | 0.0 | 0.0 | 113494 Syn Fluid Cells RA | 0.9 | 2.1
112390 Match Control Crohns-M | 2.5 | 1.8 | 113499 Cartilage4 RA | 2.1 | 1.8
112726 Crohns-M | 3.8 | 5.9 | 113500 Bone4 RA | 1.8 | 2.5
112731 Match Control Crohns-M | 3.6 | 6.7 | 113501 Synovium4 RA | 2.1 | 2.3
112380 Ulcer Col-F | 4.9 | 4.9 | 113502 Syn Fluid Cells4 RA | 1.0 | 0.8
112734 Match Control Ulcer Col-F | 12.6 | 12.0 | 113495 Cartilage3 RA | 2.5 | 2.6
112384 Ulcer Col-F | 6.6 | 10.2 | 113496 Bone3 RA | 2.0 | 2.1
112737 Match Control Ulcer Col-F | 4.2 | 6.1 | 113497 Synovium3 RA | 1.4 | 1.4
112386 Ulcer Col-F | 0.5 | 1.2 | 113498 Syn Fluid Cells3 RA | 2.9 | 3.2
112738 Match Control Ulcer Col-F | 7.5 | 7.9 | 117106 Normal Cartilage Rep20 | 0.1 | 0.7
112381 Ulcer Col-M | 0.1 | 0.1 | 113663 Bone3 Normal | 0.3 | 0.1
112735 Match Control Ulcer Col-M | 2.9 | 2.3 | 113664 Synovium3 Normal | 0.0 | 0.0
112382 Ulcer Col-M | 6.7 | 8.4 | 113665 Syn Fluid Cells3 Normal | 0.1 | 3.2
112394 Match Control Ulcer Col-M | 0.5 | 0.5 | 117107 Normal Cartilage Rep22 | 5.9 | ).3
112383 Ulcer Col-M | 12.1 | 14.6 | 113667 Bone4 Normal | 0.4 | ).7
112736 Match Control Ulcer Col-M | 3.5 | 5.3 | 113668 Synovium4 Normal | 1.0 | 1.1
112423 Psoriasis-F | 1.4 | 1.1 | 113669 Syn Fluid Cells4 Normal | 1.0 | 0.7



Table AD. CNS_neurodegeneration_v1.0

Tissue Name | Rel. Exp.(%) Ag2100, Run 207929343 | Tissue Name | Rel. Exp.(%) Ag2100, Run 207929343
AD 1 Hippo | 5.2 | Control (Path) 3 Temporal Ctx | 8.5
AD 2 Hippo | 9.3 | Control (Path) 4 Temporal Ctx | 55.5
AD 3 Hippo | 6.7 | AD 1 Occipital Ctx | 3L6
AD 4 Hippo | 7.2 | AD 2 Occipital Ctx (Missing) | 0.0
AD 5 Hippo | 100.0 | AD 3 Occipital Ctx | 8.4
AD 6 Hippo | 16.5 | AD 4 Occipital Ctx | 28.7
Control 2 Hippo | 17.7 | AD 5 Occipital Ctx | 52.5
Control 4 Hippo | 3.4 | AD 6 Occipital Ctx | 22.8
Control (Path) 3 Hippo | 4.4 | Control 1 Occipital Ctx | 3.9
AD 1 Temporal Ctx | 15.7 | Control 2 Occipital Ctx | 64.6
AD 2 Temporal Ctx | 26.4 | Control 3 Occipital Ctx | 40.6
AD 3 Temporal Ctx | 12.3 | Control 4 Occipital Ctx | 6.4
AD 4 Temporal Ctx | 24.3 | Control (Path) 1 Occipital Ctx | 77.9
AD 5 Inf Temporal Ctx | 65.5 | Control (Path) 2 Occipital Ctx | 28.5
AD 5 Sup Temporal Ctx | 20.9 | Control (Path) 3 Occipital Ctx | 1.5
AD 6 Inf Temporal Ctx | 44.1 | Control (Path) 4 Occipital Ctx | 40.9
AD 6 Sup Temporal Ctx | 59.0 | Control 1 Parietal Ctx | 7.8
Control 1 Temporal Ctx | 9.5 | Control 2 Parietal Ctx | 34.4
Control 2 Temporal Ctx | 34.6 | Control 3 Parietal Ctx | 15.8
Control 3 Temporal Ctx | 0.0 | Control (Path) 1 Parietal Ctx | 68.8
Control 3 Temporal Ctx | 10.4 | Control (Path) 2 Parietal Ctx | 32.3
Control (Path) 1 Temporal Ctx | 68.8 | Control (Path) 3 Parietal Ctx | 4.9
Control (Path) 2 Temporal Ctx | 49.7 | Control (Path) 4 Parietal Ctx | 58.6

Table AE. Panel 1.3D

Tissue Name | Rel. Exp.(%) Ag2100, Run 152517508 | Tissue Name | Rel. Exp.(%) Ag2100, Run 152517508
Liver adenocarcinoma | 11.7 | Kidney (fetal) | 1.8
Pancreas | 0.0 | Renal ca. 786-0 | 7.1
Pancreatic ca. CAPAN 2 | 3.2 | Renal ca. A498 | 3.7
Adrenal gland | 1.4 | Renal ca. RXF 393 | 3.1
Thyroid | 0.1 | Renal ca. ACHN | 4.4
Salivary gland | 0.1 | Renal ca. UO-31 | 6.3
Pituitary gland | 2.1 | Renal ca. TK-10 | 3.2
Brain (fetal) | 2.1 | Liver | 0.0
Brain (whole) | 24.7 | Liver (fetal) | 3.8
Brain (amygdala) | 11.2 | Liver ca. (hepatoblast) HepG2 | 3.2
Brain (cerebellum) | 2.7 | Lung | 0.3
Brain (hippocampus) | 36.3 | Lung (fetal) | 0.9
Brain (substantia nigra) | 1.5 | Lung ca. (small cell) LX-1 | 6.6
Brain (thalamus) | 30.4 | Lung ca. (small cell) NCI-H69 |
Cerebral Cortex | 100.0 | Lung ca. (s.cell var.) SHP-77 | 7.5
Spinal cord | 2.5 | Lung ca. (large cell) NCI-H460 | OX)
glio/astro U87-MG | 6.4 | Lung ca. (non-sm. cell) A549 | 0.2
glio/astro U-118-MG | 33.7 | Lung ca. (non-s.cell) NCI-H23 | 10.4
astrocytoma SW1783 | 5.9 | Lung ca. (non-s.cell) HOP-62 | 1.4
neuro*; met SK-N-AS | 14.5 | Lung ca. (non-s.cl) NCI-H522 | 5.3
astrocytoma SF-539 | 7.4 | Lung ca. (squam.) SW 900 | 3.2
astrocytoma SNB-75 | 5.8 | Lung ca. (squam.) NCI-H596 | 7.2
glioma SNB-19 | 1.0 | Mammary gland | 0.2
glioma U251 | 2.4 | Breast ca.* (pl.ef) MCF-7 | 5.6
glioma SF-295 | 0.9 | Breast ca.* (pl.ef) MDA-MB-231 | 14.5
Heart (fetal) | 0.4 | Breast ca.* (pl.ef) T47D | 2.4
Heart | 0.1 | Breast ca. BT-549 | 6.8
Skeletal muscle (fetal) | 3.4 | Breast ca. MDA-N | 14.0
Skeletal muscle | 0.1 | Ovary | 2.2
Bone marrow | 5.4 | Ovarian ca. OVCAR-3 | 2.5
Thymus | 2.1 | Ovarian ca. OVCAR-4 | 0.8
Spleen | 0.6 | Ovarian ca. OVCAR-5 | 2.7
Lymph node | 0.4 | Ovarian ca. OVCAR-8 | 3.2
Colorectal | 1.8 | Ovarian ca. IGROV-1 | 2.0
Stomach | 1.0 | Ovarian ca.* (ascites) SK-OV-3 | 7.4
Small intestine | 1.6 | Uterus | 0.0
Colon ca. SW480 | 13.1 | Placenta | 0.2
Colon ca.* SW620 (SW480 met) | 4 S | Prostate | U-Z
Colon ca. HT29 | 4.1 | Prostate ca.* (bone met) PC-3 | z.u
Colon ca. HCT-116 | | Testis | 't-U
Colon ca. CaCo-2 | 5.9 | Melanoma Hs688(A).T | 0.7
Colon ca. tissue(OD03866) | 2.8 | Melanoma* (met) Hs688(B).T | 0.3
Colon ca. HCC-2998 | 3.7 | Melanoma UACC-62 | 0.5
Gastric ca.* (liver met) NCI-N87 | 2.3 | Melanoma M14 | 7.2
Bladder | 0.9 | Melanoma LOX IMVI | 2.8
Trachea | 0.7 | Melanoma* (met) SK-MEL-5 | 5.8
Kidney | 0.7 | Adipose | 0.2



Table AF. Panel 2.2

Tissue Name | Rel. Exp.(%) Ag2100, Run 174166901 | Tissue Name | Rel. Exp.(%) Ag2100, Run 174166901
Normal Colon | 6.3 | Kidney Margin (OD04348) | 30.4
Colon cancer (OD06064) | 13.4 | Kidney malignant cancer (OD06204B) |
Colon Margin (OD06064) | 9.0 | Kidney normal adjacent tissue (OD06204E) | 10.5
Colon cancer (OD06159) | 4.5 | Kidney Cancer (OD04450-01) | 2.4
Colon Margin (OD06159) | 5.9 | Kidney Margin (OD04450-03) | 13.3
Colon cancer (OD06297-04) | 3.8 | Kidney Cancer 8120613 | 6.7
Colon Margin (OD06297-05) | 9.9 | Kidney Margin 8120614 | 1.2
CC Gr.2 ascend colon (OD03921) | 4.4 | Kidney Cancer 9010320 | 1.7
CC Margin (OD03921) | 2.8 | Kidney Margin 9010321 | 4.5
Colon cancer metastasis (OD06104) | 1.7 | Kidney Cancer 8120607 | 0.5
Lung Margin (OD06104) | 3.1 | Kidney Margin 8120608 | 1.7
Colon mets to lung (OD04451-01) | 9.6 | Normal Uterus | 1.1
Lung Margin (OD04451-02) | 3.2 | Uterine Cancer 064011 | 1.5
Normal Prostate | 1.2 | Normal Thyroid | 0.0
Prostate Cancer (OD04410) | 0.0 | Thyroid Cancer 064010 | 0.6
Prostate Margin (OD04410) | 0.7 | Thyroid Cancer A302152 | 5.3
Normal Ovary | 2.8 | Thyroid Margin A302153 | 0.0
Ovarian cancer (OD06283-03) | 11.7 | Normal Breast | 3.0
Ovarian Margin (OD06283-07) | 3.0 | Breast Cancer (OD04566) | 8.1
Ovarian Cancer 064008 | 1.1 | Breast Cancer 1024 | 2.9
Ovarian cancer (OD06145) | 0.9 | Breast Cancer (OD04590-01) | 14.8
Ovarian Margin (OD06145) | 0.0 | Breast Cancer Mets (OD04590-03) | 3.2
Ovarian cancer (OD06455-03) | 15.8 | Breast Cancer Metastasis (OD04655-05) | 5.4
Ovarian Margin (OD06455-07) | L8 | Breast Cancer 064006 | 3.1
Normal Lung | 1.2 | Breast Cancer 9100266 | 2.6
Invasive poor diff. lung adeno (OD04945-01) | 8.4 | Breast Margin Q 1 |
Lung Margin (OD04945-03) | 1.2 | Breast Cancer A209073 | 1.8
Lung Malignant Cancer (OD03126) | 5.0 | Breast Margin A2090734 | 2.5
Lung Margin (OD03126) | 0.6 | Breast cancer (OD06083) | 17.1
Lung Cancer (OD05014A) | 10.2 | Breast cancer node metastasis | 14.7
Lung Margin (OD05014B) | 9.0 | Normal Liver | 0.4
Lung cancer (OD06081) | 10.1 | Liver Cancer 1026 | 0.0
Lung Margin (OD06081) | 4.0 | Liver Cancer 1025 | 1.8
Lung Cancer (OD04237-01) | 4.1 | Liver Cancer 6004-T | 1.1
Lung Margin (OD04237-02) | 2.0 | Liver Tissue 6004-N | 2.5
Ocular Melanoma Metastasis | 0.9 | Liver Cancer 6005-T | 1.6
Ocular Melanoma Margin (Liver) | 0.4 | Liver Tissue 6005-N | 0.0
Melanoma Metastasis | 10.4 | Liver Cancer 064003 | 0.7
Melanoma Margin (Lung) | 2.0 | Normal Bladder | 2.9
Normal Kidney | 5.0 | Bladder Cancer 1023 | 1.5
Kidney Ca, Nuclear grade 2 (OD04338) | 15.4 | Bladder Cancer A302173 | 17.8
Kidney Margin (OD04338) | 5.0 | Normal Stomach | 10.4
Kidney Ca Nuclear grade 1/2 | 100.0 | Gastric Cancer 9060397 | 1.1
Kidney Margin (OD04339) | ?.3 | Stomach Margin 9060396 | ).7
Kidney Ca, Clear cell type (OD04340) | 14.0 | Gastric Cancer 9060395 | '.8
Kidney Margin (OD04340) | LL3 | Stomach Margin 9060394 | !.8
Kidney Ca, Nuclear grade 3 (OD04348) | KO | Gastric Cancer 064005 | 1.0



Table AG. Panel 3D 



Tissue Name 


Rel. 

Exp(%) 
Ag2100, 
Run 

164796104 


Tissue Name 


Rel. 

Exp.(%) 
Ag2100, 
Run 

164796104 


Daoy- Medulloblastoma 


7.3 


Ca Ski- Cervical epidermoid 
carcinoma (metastasis) 


21.0 


TE671- Medulloblastoma 


3.8 


ES-2- Ovarian clear cell carcinoma j 


11.7 
D283 Med- Medulloblastoma


jPMA/ionomycin 6h 


1 A n 

10.8 


PFSK-1- Primitive 
Neuroectodermal 


22 2 jRamos- Stimulated with 
}PMA/ionomycin 14h 


6.2 


XF-498- CNS 


21.2 


MEG-01- Chronic myelogenous 
leukemia (megokaryoblast) 


5.8 


SNB-78- Glioma 


11.3 


Raji- Burkitfs lymphoma 


6.7 


SF-268- Glioblastoma 


7.6 


Daudi- Burkitt's lymphoma 


14.8 


T98G- Glioblastoma 


12.0 


U266- B-cell plasmacytoma 


5.1 


SK-N-SH- Neuroblastoma 
(metastasis) 


5.6 


CA46- Burkitt's lymphoma 


5.0 


SF-295- Glioblastoma 


12.4 


RL- non-Hodgkin's B-cell 
lymphoma 


3.8 


Cerebelluin 


16.2 


JM1- pre-B-cell lymphoma 


11.5 


Cerebellum 


3.6 


Jurkat- T cell leukemia 


12.5 


NCI-H292- Mucoepidermoid 
lung carcinoma 


14.0 


TF-1- Erythroleukemia 


9.9 


DMS-114- Small cell lung 
cancer 


10.4 


HUT 78- T-cell lymphoma 


14.7 


DMS-79- Small cell lung cancer 


100.0 


U937- Histiocytic Ivmnhoma 


8.1 


NCI-H146- Small cell lung 
cancer 


14.3 


KU-812- Myelogenous leukemia 


17.7 


NCI-H526- Small cell lung 
cancer 


19.8 


769-P- Clear cell renal carcinoma 


6.3 


NCI-N417- Small cell lung 
cancer 


5.8 


Caki-2- Qear cell renal carcinoma 


9.5 


NCI-H82- Small cell lung cancer 


10.2 


SW 839- Clear cell renal carcinnma 


5.2 


NCI-H157- Squamous cell lung 
cancer (metastasis) 


13.8 


G401- Wilms' tumor 


6.3 


NCI-HI 155- Large cell lung 
cancer 


36.1 


Hs766T- Pancreatic carcinoma (LN 
metastasis) 


15.7 


NCI-H1299- Large cell lung 
cancer 


22.7 


CAPAN-1- Pancreatic 
adenocarcinoma (liver metastasis) 


8.6 


NCI-H727- Lung carcinoid 


f A A 

14.4 


SU86.86- Pancreatic carcinoma 
liver metastasis) 


14.1 


NCI-UMC-1 1- Lung carcinoid : 


*5.9 


BxPC-3- Pancreatic 
adenocarcinoma 




LX-1- Small cell lung cancer 3 


L1.0 ] 


SPAC- Pancreatic adenocarcinoma 1 


14.5 


Colo-205- Colon cancer ] 


12.7 


VGA PaCa-2- Pancreatic carcinoma 1 


>.6 


KM 1 2- Colon cancer ] 


7.2 ( 

i 


3FPAC-1- Pancreatic ductal 
tdenocarcinoma 


18.7 


KM20L2- Colon cancer 1 


.0 1 

C 


> ANC-1- Pancreatic epithelioid . 
uctal carcinoma 


9.5 


NCI-H716- Colon cancer 1 


9.5 1 

c 


"24- Bladder carcinma (transitional 
ell) y 


.0 


SW-48- Colon adenocarcinoma 1 


0.6 5 


637- Bladder carcinoma 1 


0.5 


SW1 1 16- Colon adenocarcinoma 7 


.7 I- 


[T-l 1 97- Bladder carcinoma 4 


.8 
LS 174T- Colon adenocarcinoma


9.8 


UM-UC-3- Madder ^nSP ' 
(transitional cell) 


13.3 


SW-948- Colon adenocarcinoma 


1.4 


A204- Rhabdomyosarcoma 


15.2 


SW-480- Colon adenocarcinoma 


7.6 


HT-1080- Fibrosarcoma 


11.9 


NCI-SNU-5- Gastric carcinoma 


14.9 


MG-63- Osteosarcoma 


7.3 


KATO HI- Gastric carcinoma 


18.8 


SK-LMS-l- Leiomyosarcoma 
(vulva) 


AO f\ 

48.0 


NCI-SNU-16- Gastric carcinoma 


12.6 


SJRH30- Rhabdomyosarcoma (met 
to bone marrow) 


10.2 


NCI-SNU-1- Gastric carcinoma 


123 


A431- Epidermoid carcinoma 


12.2 


RF-1- Gastric adenocarcinoma 


5.3 


WM266-4- Melanoma 


21.9 


RF-48- Gastric adenocarcinoma 


7.6 


DU 145- Prostate carcinoma (brain 
metastasis) 


0.2 


main -so- uastnc carcinoma 


11.7 


MDA-MB-468- Breast 
adenocarcinoma 


5.6 


NCI-N87- Gastric carcinoma 


9.3 


SCG4- Squamous cell carcinoma 
of tongue 


U.J 


OVCAR-5- Ovarian carcinoma 


3.0 


SCC-9- Squamous cell carcinoma 
of tongue 


0.3 


RL95-2- Uterine carcinoma 


4.5 


SCC-15- Squamous cell carcinoma 
of tongue 


0.2 


HelaS3- Cervical 
adenocarcinoma 


9.0 


CAL 27- Squamous cell carcinoma 
of tongue 


19.9 



Table AH. Panel 4D



Tissue Name


Rel. Exp.(%) Ag2100, Run 152800279


Tissue Name


Rel. Exp.(%) Ag2100, Run 152800279


Secondary Th1 act


15.4


HUVEC IL-1beta


12.2 


Secondary Th2 act 


11.9 


HUVEC IFN gamma


16.6 


Secondary Tr1 act


15.6 


HUVEC TNF alpha + IFN gamma 


11.8 


Secondary Th1 rest


4.9 


HUVEC TNF alpha + IL4 


11.4 


Secondary Th2 rest 


3.3 


HUVEC IL-11 


8.2 


Secondary Tr1 rest


6.0 


Lung Microvascular EC none 


7.3 


Primary Th1 act


13.6


Lung Microvascular EC TNFalpha
+ IL-1beta


6.3 


Primary Th2 act 


12.0 


Microvascular Dermal EC none 


23.3


Primary Tr1 act


22.2


Microvascular Dermal EC
TNFalpha + IL-1beta


10.5 


Primary Th1 rest


100.0


Bronchial epithelium TNFalpha +
IL-1beta


0.6 


Primary Th2 rest 


37.9 


Small airway epithelium none


1.6


Primary Tr1 rest


29.3


Small airway epithelium TNFalpha
+ IL-1beta


7.4


CD45RA CD4 lymphocyte act 


13.6 


Coronary artery SMC rest


4.4


CD45RO CD4 lymphocyte act


15.4


Coronary artery SMC TNFalpha +
IL-1beta


2.0 


CD8 lymphocyte act 


10.6


Astrocytes rest


1.3 


Secondary CD8 lymphocyte rest 


7.9 


Astrocytes TNFalpha + IL-1beta


0.5 


Secondary CD8 lymphocyte act 


17.3 


KU-812 (Basophil) rest 


22.4 


CD4 lymphocyte none


0.5


KU-812 (Basophil)
PMA/ionomycin


28.5


2ry Th1/Th2/Tr1_anti-CD95
CH11


17.1 


CCD1106 (Keratinocytes) none 


14.3 


LAK cells rest 


3.6 


CCD1106 (Keratinocytes)
TNFalpha + IL-1beta


18.4


iu>o


Liver cirrhosis


LAK cells IL-2+IL-12


O.T


Lupus kidney


LAK cells IL-2+IFN gamma


16.4


LAK cells IL-2+IL-18


16.8


NCI-H292 IL-4


11.7


LAK cells PMA/ionomycin


0.6


NCI-H292 IL-9


32.3


NK Cells IL-2 rest


15.3


NCI-H292 IL-13


13.4 


Two Way MLR 3 day 


1.8 


NCI-H292 IFN gamma


11.0 


Two Way MLR 5 day 


6.1 


HPAEC none


8.5 


Two Way MLR 7 day 


10.1 


HPAEC TNF alpha + IL-1 beta 


7.7 


PBMC rest 


0.1 


Lung fibroblast none 


6.3 


PBMC PWM


25.5 


Lung fibroblast TNF alpha + IL-1 
beta 


9.0 


PBMC PHA-L 


24.0 


Lung fibroblast IL-4 


3.7 


Ramos (B cell) none 


17.7


Lung fibroblast IL-9 


5.0 


Ramos (B cell) ionomycin 


92.0 


Lung fibroblast IL-13 


1.7 


B lymphocytes PWM 


48.6 


Lung fibroblast IFN gamma


3.4


B lymphocytes CD40L and IL-4


16.4 


Dermal fibroblast CCD 1070 rest 


57.4 


EOL-1 dbcAMP 


10.5 


Dermal fibroblast CCD1070 TNF
alpha


79.0 


EOL-1 dbcAMP 


7.0 


Dermal fibroblast CCD1070 IL-1
beta 


21.8 




0.5 


Dermal fibroblast IFN gamma 






0.0 


Dermal fibroblast IL-4


7 


Dendritic cells anti-CD40 


0.0 


IBD Colitis 2 


0.9 


Monocytes rest 


0.2 


IBD Crohn's 


1.0 


Monocytes LPS 


0.0 


Colon 


3.7 


Macrophages rest 


4.4 


Lung 


1.5 


Macrophages LPS 


0.6 


Thymus 


13.0 


HUVEC none


24.7 


Kidney 


31.2 


HUVEC starved 


43.5


Table AI. Panel CNS_1



Tissue Name


Rel. Exp.(%) Ag2100, Run 171649357


Tissue Name


Rel. Exp.(%) Ag2100, Run 171649357


BA4 Control 


23.8 


BA17PSP 


35.4 


BA4 Control2 


19.1 


BA17PSP2 


18.3 


BA4 Alzheimer's2


7.3 


Sub Nigra Control 


11.6 


BA4 Parkinson's 


43.8 


Sub Nigra Control2 


5.0 


BA4Parkinson's2 


60.7 


Sub Nigra Alzheimer's2


4.6 


BA4 Huntington's 


23.3 


Sub Nigra Parkinson's2


11.8 


BA4 Huntington's2 


14.7 


Sub Nigra Huntington's 


16.0 


BA4PSP 


13.8 


Sub Nigra Huntington's2 


8.8 


BA4PSP2 


26.2 


Sub Nigra PSP2 


1.7 


BA4 Depression


15.4 


Sub Nigra Depression 


2.7 


BA4 Depression2 


17.0 


Sub Nigra Depression2 


8.0 


BA7 Control 


36.6 


GlobPalladus Control 


8.4 


BA7Control2 


17.4 


Glob Palladus Control2 


10.8 


BA7 Alzheimer 


11.3 


GlobPalladus Alzheimer's 


1.8 


BA7 Parkinson's 


21.9 


Glob Palladus Alzheimer's2


8.3 


BA7Parkinson's2 


36.1 


Glob Palladus Parkinson's 


51.1 


BA7 Huntington's 


56.3 


Glob Palladus Parkinson's2 


12.9 


BA7Huntington's2 


45.1 


Glob Palladus PSP 


9.3 


BA7PSP 


44.4 


GlobPalladus PSP2 


9.9 


BA7PSP2 


17.6 


Glob Palladus Depression 




BA7 Depression 


8.5 


Temp Pole Control 


9.8 


BA9 Control 


31.9 


Temp Pole Control2 


21.5 


BA9Control2 


34.4 


Temp Pole Alzheimer's 


6.6 


BA9 Alzheimer's 


8.0 


Temp Pole Alzheimer's2


8.1 


BA9Alzheimer's2 


20.0 


Temp Pole Parkinson's 


33.0 


BA9 Parkinson's 


40.6 


Temp Pole Parkinson's2


24.8


BA9 Parkinson's2


31.4 


Temp Pole Huntington's 


33.2 


BA9 Huntington's 


41.5 


Temp Pole PSP 


8.8


BA9Huntington's2 


21.8 


Temp Pole PSP2 


6.0 


BA9 PSP 


17.8 


Temp Pole Depression2 


17.0 


BA9PSP2 


8.2 


Cing Gyr Control 


23.3 


BA9 Depression 


10.5 


Cing Gyr Control2


17.8 


BA9 Depression2 


16.2 


Cing Gyr Alzheimer's 


7.3 


BA17 Control 


58.2 


Cing Gyr Alzheimer's2


10.4 


BA17 Control2 


H.8 


Cing Gyr Parkinson's 


13.4 


BA17 Alzheimer^ : 


27.0


Cing Gyr Parkinson's2


17.0


BA17 Parkinson's


58.6


Cing Gyr Huntington's


>8.3


BA17 Parkinson's2


69.3


Cing Gyr Huntington's2




BA17 Huntington's 


44.4 


CingGyrPSP 


7.2 


BA17 Huntington's2


31.9 


Cing Gyr PSP2 


4.0 


BA17 Depression 


13.6 


Cing Gyr Depression 


6.9 


BA17 Depression2 


100.0 


Cing Gyr Depression2 


10.4 



AI.05 chondrosarcoma Summary: Ag2100 Highest expression of this gene is
detected in an untreated, serum-starved chondrosarcoma cell line (SW1353) (CT=27).
Interestingly, expression of this gene appears to be somewhat down-regulated upon
treatment with IL-1, a potent activator of pro-inflammatory cytokines and matrix
metalloproteinases that participate in the destruction of cartilage observed in
osteoarthritis (OA). Modulation of the expression of this transcript in chondrocytes by
either small molecules or antisense might be important for preventing the degeneration
of cartilage observed in OA.

AI_comprehensive panel_v1.0 Summary: Ag2100 Highest expression of this
gene is detected in osteoarthritis (OA) bone (CTs=27-28). This gene is highly expressed in
bone isolated from 5 different OA patients and in synovium from 3 out of 5 OA
patients, but not in cartilage from OA patients nor in any tissues from rheumatoid arthritis
(RA) patients or control samples. Thus, small molecule therapeutics designed against the
protein encoded by this gene could reduce or inhibit inflammation. Antisense
therapeutics that would block the translation of the transcript and protein production could
also inhibit inflammatory processes. These types of therapeutics could be important in the
treatment of diseases such as osteoarthritis.

CNS_neurodegeneration_v1.0 Summary: Ag2100 This panel confirms the
expression of this gene at low levels in the brains of an independent group of individuals.
However, no differential expression of this gene was detected between Alzheimer's
diseased postmortem brains and those of non-demented controls in this experiment. Please
see Panel 1.3D for a discussion of this gene in treatment of central nervous system
disorders.

Panel 1.3D Summary: Ag2100 Expression of this gene is highest in cerebral
cortex (CT = 26.3). This gene is expressed at moderate levels in all the regions of the CNS,
including amygdala, cerebellum, hippocampus, substantia nigra, thalamus, spinal cord, and
fetal brain. This gene encodes a protein with homology to citron-kinase. Citron-kinase
(Citron-K) has been proposed by in vitro studies to be a crucial effector of Rho in
regulation of cytokinesis. Citron-K is essential for cytokinesis in vivo in specific neuronal
precursors and may play a fundamental role in specific [...]
the CNS (Di Cunto et al., 2000, Neuron 28:115-127, PMID: 11086988). General inhibitors
of the RHO/RAC-INTERACTING CITRON KINASE family disrupt endothelial tight
junctions, suggesting that specific modulators of this brain-preferential family member
could be useful in delivery of therapeutics across the blood brain barrier. These general
inhibitors also influence intracellular calcium flux, which is a central component of many
important neuronal processes, such as apoptosis, neurotransmitter release and signal
transduction (Jezior et al., 2001, Br. J. Pharmacol. 134:78-87, PMID: 11522599; Walsh et
al., 2001, Gastroenterology 121:566-579, PMID: 11522741). Thus, modulators of the
function of the protein encoded by this gene may prove useful in the treatment of
neurodegenerative disorders involving apoptosis, such as spinal muscular atrophy,
Alzheimer's disease, Huntington's disease, Parkinson's disease, and others. Diseases
involving neurotransmitters or signal transduction, such as schizophrenia, mania, stroke,
epilepsy and depression, may also benefit from agents that modulate the function of this
gene product.

This gene also shows moderate to low expression in several metabolic tissues
including adrenal gland, pituitary gland, gastrointestinal tract, fetal heart, fetal skeletal
muscle and fetal liver. Therefore, therapeutic modulation of the activity of this gene may
prove useful in the treatment of endocrine/metabolically related diseases, such as obesity
and diabetes.

Interestingly, expression of this gene is higher in fetal tissues (CTs=31) than in
the corresponding adult liver and skeletal muscle (CTs=37-40). This
observation suggests that expression of this gene can be used to distinguish fetal from adult
liver and skeletal muscle. In addition, the relative overexpression of this gene in fetal tissue
suggests that the protein product may enhance liver and muscle growth or development in
the fetus and thus may also act in a regenerative capacity in the adult. Therefore,
therapeutic modulation of the protein encoded by this gene could be useful in treatment of
liver and skeletal muscle related diseases.
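
As a rough worked example of the fetal-versus-adult comparison above, a difference in CT values can be translated into an approximate fold difference in transcript abundance. The sketch below assumes idealized amplification in which the product doubles every cycle (an assumption; real PCR efficiency is usually somewhat lower), and the function name `fold_difference` is hypothetical rather than anything defined in this document.

```python
def fold_difference(ct_lower_expr: float, ct_higher_expr: float,
                    amplification_per_cycle: float = 2.0) -> float:
    """Approximate fold excess of template in the sample with the smaller CT.

    Assumes the amount of product is multiplied by `amplification_per_cycle`
    in every cycle (2.0 means perfect doubling, an idealization).
    """
    return amplification_per_cycle ** (ct_lower_expr - ct_higher_expr)


# CT values quoted in the Panel 1.3D summary for Ag2100.
fetal_ct = 31.0
adult_ct_range = (37.0, 40.0)

# Fold excess of fetal over adult expression at both ends of the adult range.
folds = [fold_difference(adult_ct, fetal_ct) for adult_ct in adult_ct_range]
print(folds)
```

Under this doubling assumption, CTs of 31 versus 37-40 correspond to roughly 64-fold to 512-fold higher transcript levels in the fetal samples.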

Moderate levels of expression of this gene are also seen in a cluster of cancer cell lines
derived from pancreatic, gastric, colon, lung, liver, renal, breast, ovarian, prostate,
melanoma and brain cancers. Thus, therapeutic modulation of the expression or function of
this gene may be effective in the treatment of pancreatic, gastric, colon, lung, liver, renal,
breast, ovarian, prostate, melanoma and brain cancers.

Panel 2.2 Summary: Ag2100 Expression of this gene is highest in a [illegible]
sample (CT=28). In addition, significant expression of this gene is also seen in a number of
normal and cancer tissues including colon, lung, ovary, breast, kidney, thyroid, liver,
bladder, and stomach. Interestingly, this gene is expressed at slightly higher levels in most
of the tumors than in the normal matched tissue. Thus, expression of this gene could be
used to distinguish between cancerous tissue and normal tissue. In addition, therapeutic
modulation of this gene product, through the use of small molecule drugs or antibodies,
might be of benefit in the treatment of cancer.

Panel 3D Summary: Ag2100 Expression of this gene is highest in a lung cancer
cell line (CT = 26). However, low to moderate expression is also seen in the majority of
cancer cell lines on this panel, suggesting that this gene may play an important role in many
cell types.

Panel 4D Summary: Ag2100 Highest expression of this gene is detected in resting
primary Th1 cells (CT=24.5). Moderate to low levels of expression of this gene are seen in
members of the T-cell, B-cell, endothelial cell, macrophage/monocyte, and peripheral
blood mononuclear cell families, as well as in epithelial and fibroblast cell types from lung
and skin, and in normal tissues represented by colon, lung, thymus and kidney. Interestingly,
this gene is highly induced in Ramos B cells treated with PMA and ionomycin, and in
non-transformed B cells and PBMC treated with PWM. All three of these observations are
consistent with this gene being induced in B cells after activation. This gene product has
homology to the RHO/RAC-interacting citron kinase. Thus, the citron kinase encoded by this
gene may play an important role in T cell activation, by regulating TCR-mediated T cell
spreading, chemotaxis and other chemokine responses, and in apoptosis. Likewise, this
putative kinase may also be important in B cell motility, antigen receptor mediated
activation and apoptosis.
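
The induction described in the Panel 4D summary can be illustrated with a short calculation on the relative expression values listed in Table AH. The sketch below is illustrative only: the pairing of unstimulated and stimulated samples and the helper name `fold_induction` are assumptions made for this example, and a resting value as low as 0.1% sits near the floor of the scale, so the resulting ratio is imprecise.

```python
# Rel. Exp.(%) values transcribed from Table AH (Panel 4D, Ag2100):
# (unstimulated value, stimulated value) for each comparison.
PAIRS = {
    "Ramos (B cell): none vs ionomycin": (17.7, 92.0),
    "PBMC: rest vs PWM": (0.1, 25.5),
}


def fold_induction(unstimulated: float, stimulated: float) -> float:
    """Ratio of stimulated to unstimulated relative expression."""
    if unstimulated <= 0:
        raise ValueError("unstimulated value must be positive to form a ratio")
    return stimulated / unstimulated


induction = {label: fold_induction(*values) for label, values in PAIRS.items()}
for label, ratio in induction.items():
    print(f"{label}: about {ratio:.1f}-fold")
```

On these values the Ramos comparison comes out at roughly 5-fold and the PBMC comparison at roughly 255-fold, which is in line with the statement that the gene is induced upon activation.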

Small molecule therapeutics designed against the protein encoded by this gene
could reduce or inhibit inflammation. Antisense therapeutics that would block the
translation of the transcript and protein production could also inhibit inflammatory
processes. These types of therapeutics could be important in the treatment of diseases such
as osteoarthritis. Likewise, these therapeutics could be important in the treatment of
asthma, psoriasis, diabetes, and IBD, which require activated T cells, as well as diseases
that involve B cell activation such as systemic lupus erythematosus.

Panel CNS_1 Summary: Ag2100 This panel confirms the expression of this gene
at low levels in the brains of an independent group of individuals. Please see Panel 1.3D for
a discussion of this gene in treatment of central nervous system disorders.

B. CG117662-02: Renal renin precursor like

Expression of gene CG117662-02 was assessed using the primer-probe sets Ag2078
and Ag5185, described in Tables BA and BB. Results of the RTQ-PCR runs are shown in
Tables BC, BD, BE, BF and BG.

Table BA. Probe Name Ag2078

Primers  Sequence                                    Length  Start Position  SEQ ID No
Forward  5'-accttcaaagtcgtctttgaca-3'                22      292             252
Probe    TET-5'-ctccaagtgcagccgtctctacactg-3'-TAMRA  26      342             253
Reverse  5'-cgaagagcttgtgatacacaca-3'                22      370             254


Table BB. Probe Name Ag5185

Primers  Sequence                                    Length  Start Position  SEQ ID No
Forward  5'-ccgtgtctgtggggtcat-3'                    18      491             255
Probe    TET-5'-attggtagacaccggtgcatcGtaca-3'-TAMRA  26      540             256
Reverse  5'-tggagctggtagaacctgaga-3'                 21      566             257


Table BC. CNS_neurodegeneration_v1.0



Tissue Name


Rel. Exp.(%) Ag5185, Run 226559655


Tissue Name


Rel. Exp.(%) Ag5185, Run 226559655


AD 1 Hippo 


5.7 


Control (Path) 3 Temporal Ctx 


48.6 


AD 2 Hippo 


82.4 


Control (Path) 4 Temporal Ctx 


54.3 


AD 3 Hippo 


11.4 


AD 1 Occipital Ctx 


12.2 


AD 4 Hippo 


50.0 


AD 2 Occipital Ctx (Missing) 


0.0 


AD 5 Hippo 


22.5 


AD 3 Occipital Ctx 


18.8


1^9


AD 4 Occipital Ctx


19.8 


Control 2 Hippo 


9.6 


AD 5 Occipital Ctx


12.1 


Control 4 Hippo 


18.3 


AD 6 Occipital Ctx


Zj.u 


Control (Path) 3 Hippo 


85.3 


Control 1 Occipital Ctx


ZO.Z 


AD 1 Temporal Ctx 


38.4 


Control 2 Occipital Ctx


3.6 


AD 2 Temporal Ctx 


74.7 


Control 3 Occipital Ctx


40.6 


AD 3 Temporal Ctx 


0.0 


Control 4 Occipital Ctx


20.9


AD 4 Temporal Ctx


49.0


Control (Path) 1 Occipital Ctx


39.2 


AD 5 Inf Temporal Ctx 


31.6 


Control (Path) 2 Occipital Ctx


18.3 


AD 5 Sup Temporal Ctx 


36.3 


Control (Path) 3 Occipital Ctx


0.0


AD 6 Inf Temporal Ctx


55.5


Control (Path) 4 Occipital Ctx


0.0


AD 6 Sup Temporal Ctx


63.3


Control 1 Parietal Ctx


40./ 


Control 1 Temporal Ctx 


100.0 


Control 2 Parietal Ctx 


0.0 


Control 2 Temporal Ctx 


40.6 


Control 3 Parietal Ctx 


12.2 


Control 3 Temporal Ctx


47.0 


Control (Path) 1 Parietal Ctx 


65.5 


Control 3 Temporal Ctx 


24.7 


Control (Path) 2 Parietal Ctx 


23.8 


Control (Path) 1 Temporal Ctx 


50.7


Control (Path) 3 Parietal Ctx 


0.0 


Control (Path) 2 Temporal Ctx


65.5


Control (Path) 4 Parietal Ctx 


57.4 



Table BD. General_screening_panel_v1.5



Tissue Name


Rel. Exp.(%) Ag5185, Run 228757766


Tissue Name


Rel. Exp.(%) Ag5185, Run 228757766


Adipose 


1.0 


Renal ca.TK-10 


0.0 


Melanoma* Hs688(A).T 


0.2 


Bladder 


0.5 


Melanoma* Hs688(B).T 


0.1 


Gastric ca. (liver met.) NCI-N87 


1.1 


Melanoma* M14 


0.1 


Gastric ca. KATO III


0.3 


Melanoma* LOXIMVI


0.1 


Colon ca. SW-948


18.2 


Melanoma* SK-MEL-5 


0.2 


Colon ca. SW480 


0.6 


Squamous cell carcinoma SCC-4 


0.4 


Colon ca.* (SW480 met) SW620


0.5 


Testis Pool 


8.4 


Colon ca. HT29


1.6 


Prostate ca.* (bone met) PC-3 


1.5 


Colon ca. HCT-116


0.5 


Prostate Pool 


0.6 


Colon ca. CaCo-2 


0.2 


Placenta 


3.0 


Colon cancer tissue 


2.6 


Uterus Pool 


1.5 


Colon ca. SW1116


0.1 


Ovarian ca. OVCAR-3 


0.9 


Colon ca. Colo-205 


0.0 


Ovarian ca. SK-OV-3 


0.2 


Colon ca. SW-48


0.8 


Ovarian ca. OVCAR-4 


0.7 


Colon Pool 


4.7 


Ovarian ca. OVCAR-5 


4.7 


Small Intestine Pool 


4.0 


Ovarian ca. IGROV-1 


0.0 


Stomach Pool 


2.3


Ovarian ca. OVCAR-8


U.Z


Bone Marrow Pool




Ovary 


o./ 


Fetal Heart 


0.2 


Breast ca. MCF-7


A fC 

U.J 


Heart Pool 


1.6 


Breast ca. MDA-MB-231


A < 


Lymph Node Pool


12.6 


Breast ca. BT 549


a o 


Fetal Skeletal Muscle 


0.3 


Breast ca.


2.1 


Skeletal Muscle Pool


0.4 


Breast ca. MDA-N


0.0


Spleen Pool


0.1 


Breast Pool


5.0


Thymus Pool 


3.8 


Trachea


1.0 


CNS cancer (glio/astro) U87-MG


0.0 


Lung 


22.1 


CNS cancer (glio/astro) U-118-MG 


0.1 


Fetal Lung 


0.6 


CNS cancer (neuro;met) SK-N-AS 


0.0 


Lung ca. NCI-N417


0.4 


CNS cancer (astro) SF-539 


0.0 


Lung ca. LX-1


0.3 


CNS cancer (astro) SNB-75 


0.0 


Lung ca. NCI-H146 


0.0 


CNS cancer (glio)SNB-19 


0.3 


Lung ca. SHP-77


0.1


CNS cancer (glio) SF-295 


0.5 


Lung ca. A549


0.0 


Brain (Amygdala) Pool 


0.4 


Lung ca. NCI-H526 


0.5


Brain (cerebellum)


0.5


Lung ca. NCI-H23 


1.4


Brain (fetal)


0.0 


Lung ca. NCI-H460 


2.0 


Brain (Hippocampus) Pool 


0.2 


Lung ca. HOP-62


0.1 


Cerebral Cortex Pool 


0.3 


Lung ca. NCI-H522


0.6 


Brain (Substantia nigra) Pool 


0.3 


Liver


1.0 


Brain (Thalamus) Pool 


0.6 


Fetal Liver


1.0 


Brain (whole) 


0.8 


Liver ca. HepG2


0.0 


Spinal Cord Pool 


0.5


Kidney Pool 


4.2 


Adrenal Gland 


2.6 


Fetal Kidney 


100.0 


Pituitary gland Pool 


0.6 


Renal ca. 786-0 


0.0 


Salivary Gland 


0.5 


Renal ca. A498 


0.0


Thyroid (female) 


0.1 


Renal ca. ACHN 


0.2 


Pancreatic ca. CAPAN2 


0.2 


Renal ca.UO-31 


0.3 


Pancreas Pool


1.9 



Table BE. Panel 1.3D



Tissue Name


Rel. Exp.(%) Ag2078, Run 165626684


Rel. Exp.(%) Ag2078, Run 165627496


Rel. Exp.(%) Ag2078, Run 165678122


Tissue Name


Rel. Exp.(%) Ag2078, Run 165626684


Rel. Exp.(%) Ag2078, Run 165627496


Rel. Exp.(%) Ag2078, Run 165678122


Liver 

adenocarcinoma 


0.0 


0.1 


0.1 


Kidney (fetal)


100.0 


100.0 


100.0 


Pancreas 


0.0 


0.0 


0.0 


Renal ca. 786-0 


0.0 


0.0 


0.0 


Pancreatic ca.
CAPAN 2


0.0 


0.0 


0.2 


Renal ca.A498 


0.0 


0.0 


0.1 






Adrenal gland 


0.5 


0.5 


0.3 


Renal ca. RXF
393


0.0


0.0


0.0


Thyroid


a a 




A A 

u.o 


Renal ca. 
ACHN 


0.0 


0.0 


0.0 


Salivary gland 


0.0 


0.1 


0.0 


Renal ca. 
UO-31 


0.0 


0.0 


0.1 


Pituitary gland 


0.0 


0.2 


0.0 


Renal ca. 
TK-10 


0.0 


0.1 


0.0 


Brain (fetal) 


0.0 


0.0 


0.0 


Liver 


0.3 


0.3 


0.0 


Brain (whole) 


0.0 


0.0 


0.1 


Liver (fetal)


0.6




a 


Brain 

(amygdala) 


0.1 


0.0 


0.0 


Liver ca. 

(hepatoblast) 

HepG2


0.0 


0.0 


0.0 


Brain 

(cerebellum) 


0.1 


0.0 


0.1 


Lung 


0.0 


0.0 


0.1 


Brain 

(hippocampus)


0.0 


0.3 


0.0 


Lung (fetal) 


0.1 


0.1 


0.0 


Brain 

(substantia 

nigra) 


0.0 


0.0 


0.1 


Lung ca. (small
cell) LX-1


O 0 

V/.V/ 


n ft 




Brain 
(thalamus) 


0.1 


0.0 


0.1


Lung ca. (small
cell) NCI-H69


0.0 


0.0 


0.0 


Cerebral Cortex 


0.0 


0.0 


0.2 


Lung ca. (s.cell
var.) SHP-77


0.0 


0.1 


0.0 


Spinal cord 


0.0 


0.0 


0.0 


Lung ca. (large
cell) NCI-H460


0.0 


0.0 


0.0 


glio/astro 
U87-MG 


A A 

o.o 


A A 
0.0 


0.0 


Lung ca. 
(non-sm. cell) 
A549 


0.0 


0.0 


0.0 


glio/astro 
U-118-MG 


a A 
0.0 


A A 
0.0 


0.0 


Lung ca. 

(non-s.cell) 

NCI-H23 


0.0 


0.0 


0.0 


astrocytoma 
SW1783 


a a 


n a 
0.0 


A 1 

0.1 


Lung ca. 

(non-s.cell) 

HOP-62 


0.1 


0.0 


0.0 


neuro*; met 
SK-N-AS 


n a 
u.u 




A A 

J.O 


Lung ca. 

[non-s.cl) 

NCI-H522 


o.o 


0.0 


o.o 


astrocytoma 
SF-539 


).0 


3.0 


10 

• 


Lung ca. 
[squam.) SW 
WO 


3.1 


3.1 


3.0 


astrocytoma 
SNB-75 


).0 


).0 


] 

3.2 ( 

I 


Lung ca. 

squam.) ( 
VTCI-H596 


3.0 


3.0 ( 


3.0 


glioma SNB-19 


).0 ( 


).0 ( 


).0 1 

i 


Vlammary ^ 
jland 


).2 ( 


3.2 ( 


).l 


gliomaU25l ( 


).0 ( 


).0 C 




3reast ca.* ( 
pl.ef)MCF-7 


).0 ( 


J.O ( 


U 
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glioma SF-295 


0.0 


0.0 


0.0 


-rr — gfr 

JBreast ca.* 

(pl.ef) 

|mDA-MB-231 


0.1 


SOB!/ 
0.0 


0.0 


•ft 


Heart (fetal) 


0.0 


0.0 


0.0 


Breast ca * 
(pl.ef)T47D 


0.1 


0.0 


0.0 




Heart 


0.0 


0.0 


0.0 


Breast ca. 
BT-549 


0.0 


0.0 


0.0 




Skeletal muscle 
(fetal) 


0.0 


0.0 


0.0 


Breast ca. 
MDA-N 


0.0 


0.0 


0.0 




Skeletal muscle 


0.0 


0.0 


0.0 


Ovary 


0.6 


0.8 


0.6 




Bone marrow 


0.0 


0.0 


0.0 


Ovarian ca. 
OVCAR-3 


0.1 


0.1 


0.0 




Thymus 


0.0 


0.0 


0.0 


Ovarian ca, 
OVCAR-4 


0.0 


0.1 


0.0 


Spleen 


0.0 


0.0 


0.0 


Ovarian ca. 
OVCAR-5 


0.2 


0.2 


0.1 




Lymph node 


0.0 


0.1 


0.0 


Ovarian ca. 
OVCAR-8 


0.0 


0.0 


0.0 




Colorectal 


0.0 


0.0 


0.0 


Ovarian ca 
IGROV-1 


0.0 


0.0 


0.0 


Stomach 


0.0 


0.0 


0.1 


Ovarian ca.* 
SK-OV-3 


0.0 


0.0


0.0


Small intestine 


0.1 


0.0 


0.0 


Uterus 


1.7


1.1


1.1


Colon ca. 
SW480 


0.0 


0.0 


0.0 


Placenta 


0.7 


1.2 


0.7 


Colon ca.* 

SW620(SW480 

met) 


0.0 


0.0 


0.0 


Prostate 


0.1


0.0


0.1


Colon ca. HT29 


0.2 


0.3 


0.3 


Prostate ca.*
(bone met) PC-3


0.2 


0.2 


0.0 


Colon ca. 
HCT-116 


0.0 


0.0 


0.0 


Testis 


0.2 


0.1 


0.2 


Colon ca. 
CaCo-2 


0.0 


0.0 


0.0 


Melanoma 
Hs688(A).T 


0.0 


0.0 


0.0 


tissue(0DO386 
6) 


0.2 • 


0.1 


0.5 


Melanoma* 
(met) 

Hs688(B).T 


0.0 


0.0 


0.0 


Colon ca. 
HCC-2998 


0.1 


0.3 




Melanoma 
UACC-62 


3.0 


0.0 


0.0 


(liver met) 
NCI-N87 


3.0 


3.1 


3.0 


Melanoma M14 


3.0 


3.0 


3.0 


Bladder 


10 


3.0 


,0 | 


Melanoma . 
1X3X LMVI 1 


3.0 


3.0 ( 


3.1 


Trachea 


).l 


3.0 ( 


I 

).0 ( 
< 


Melanoma* 
met) ( 
5K-MEL-5 


).0 ( 


3.0 ( 


).0 


Kidney ] 


Ll.2 : 


10.8 I 


17 i 


Adipose ( 


).0 ( 


).2 ( 


).0


Table BF. Panel 4D



Tissue Name


Rel. Exp.(%) Ag2078, Run 161905846


Tissue Name


Rel. Exp.(%) Ag2078, Run 161905846


Secondary Th1 act


0.0 


HUVEC IL-lbeta 


0.0 


Secondary Th2 act 


0.0 


HUVEC IFN gamma 


0.0


Secondary Tr1 act


0.0 


HUVEC TNF alpha + IFN gamma 


0.0 


Secondary Th1 rest


0.0


HUVEC TNF alpha + IL4


0.0


Secondary Th2 rest 


0.0 


HUVEC IL-11 


0.0 


Secondary Tr1 rest


0.0 


Lung Microvascular EC none 


0.2 


Primary Th1 act


0.0


Lung Microvascular EC TNFalpha 
+ IL-lbeta 


0.2 


Primary Th2 act 


0.0 


Microvascular Dermal EC none 


0.3 


Primary Tr1 act


0.0


Microvascular Dermal EC
TNFalpha + IL-1beta


0.1


Primary Th1 rest


0.0


Bronchial epithelium TNFalpha +
IL-1beta


0.1


Primary Th2 rest 


0.0 


Small airway epithelium none 


0.0 


Primary Tr1 rest


0.0


Small airway epithelium TNFalpha
+ IL-1beta


0.5


CD45RA CD4 lymphocyte act 


0.8 


Coronary artery SMC rest


0.1


CD45RO CD4 lymphocyte act


0.0


Coronary artery SMC TNFalpha +
IL-1beta


0.1


CD8 lymphocyte act 


0.0 


Astrocytes rest 


0.0 


Secondary CD8 lymphocyte rest 


0.0 


Astrocytes TNFalpha + IL-1beta


0.1


Secondary CD8 lymphocyte act 


0.0 


KU-812 (Basophil) rest 


0.0


CD4 lymphocyte none 


0.0 


KU-812 (Basophil) 
PMA/ionomycin 


0.0 


2ry Th1/Th2/Tr1_anti-CD95
CH11


0.0


CCD1106 (Keratinocytes) none


0.0 


LAK cells rest 


0.0 


CCD1106 (Keratinocytes)
TNFalpha + IL-1beta


0.0


LAK cells IL-2


0.0


Liver cirrhosis 


0.4 


LAK cells IL-2+IL-12 


).0 


Lupus kidney 


3.9 


LAK cells IL-2+IFN gamma


3.0 


NCI-H292 none


1.3 


LAK cells IL-2+ IL-18 


3.0 ] 


NCI-H292 IL-4


).5 


LAK cells PMA/ionomycin i 


3.1 ] 


NCI-H292 IL-9


1.9 


NK Cells IL-2 rest ( 


3.0 ] 


NCI-H292 IL-13


).3 


Two Way MLR 3 day ( 


).0 ] 


NCI-H292 IFN gamma


1.0 


Two Way MLR 5 day ( 


).0 ] 


HPAEC none


).0 


Two Way MLR 7 day


).0


HPAEC TNF alpha + IL-1 beta


).0


PBMC rest


0.1


Lung fibroblast none




PBMC PWM


0.0


Lung fibroblast TNF alpha + IL-1 
beta 


0.0 


PBMC PHA-L 


0.0 


Lung fibroblast IL-4 


0.0 


Ramos (B cell) none 


0.0 


Lung fibroblast IL-9 


0.0 


Ramos (B cell) ionomycin 


0.0 


Lung fibroblast IL-13 


0.0 


B lymphocytes PWM 


0.0 


Lung fibroblast IFN gamma 


0.0


B lymphocytes CD40L and IL-4 


0.0 


Dermal fibroblast CCD1070 rest 


5.9 


EOL-1 dbcAMP 


n ft 


Dermal fibroblast CCD1070 TNF 
alpha 


4.5 


EOL-1 dbcAMP
PMA/ionomycin


0.2


Dermal fibroblast CCD1070 IL-1
beta 


3.1 


Dendritic cells none


0.0 


Dermal fibroblast IFN gamma 


0.0 


Dendritic cells LPS


0.0


Dermal fibroblast IL-4


0.0 


Dendritic cells anti-CD40 


0.0 


IBD Colitis 2


0.0


Monocytes rest 


0.0 


IBD Crohn's


0.0 


Monocytes LPS 


0.0 


Colon 


0.0 


Macrophages rest 


0.0 


Lung 


0.2 


Macrophages LPS 


0.0 


Thymus 


100.0 


HUVEC none 


0.4 


Kidney 


0.4 


HUVEC starved 


0.2 







Table BG. Panel 5D



Tissue Name


Rel. Exp.(%) Ag2078, Run 168095527


Tissue Name


Rel. Exp.(%) Ag2078, Run 168095527


97457_Patient-02go_adipose


11.7


94709_Donor 2 AM - A_adipose


0.0


97476_Patient-07sk_skeletal
muscle 


0.0 


94710_Donor 2 AM - B_adipose 


0.0


97477_Patient-07ut_uterus 


2.8 


94711_Donor 2 AM - C_adipose


0.0


97478_Patient-07pl_placenta


12.9 


94712_Donor 2 AD - A_adipose 


1.0 


97481_Patient-08sk_skeletal
muscle 


0.0 


94713_Donor 2 AD - B_adipose


0.0


97482_Patient-08ut_uterus


22.8


94714_Donor 2 AD - C_adipose


0.0 


97483_Patient-08pl_placenta 


4.5 


94742_Donor 3 U - A_Mesenchymal
Stem Cells 


0.0 


97486_Patient-09sk_skeletal
muscle 


0.0 


94743_Donor 3 U - B_Mesenchymal
Stem Cells


0.0


97487_Patient-09ut_uterus


0.0


94730_Donor 3 AM - A_adipose


0.9 


97488_Patient-09pl_placenta 


2.7 


94731_Donor 3 AM - B_adipose 


0.0 


97492_Patient-10ut_uterus


100.0 


94732_Donor 3 AM - C_adipose


0.0


97493_Patient-10pl_placenta


5.4


94733_Donor 3 AD - A_adipose


97495_Patient-11go_adipose


6.0


94734_Donor 3 AD - B_adipose


0.0 


97496_Patient-11sk_skeletal
muscle 


0.0 


94735_Donor 3 AD - C_adipose 


n n 


97497_Patient-11ut_uterus


12.8


77138_Liver_HepG2untreated


0.0


97498_Patient-11pl_placenta


8.5 


/ j j Do_xiQaj\_ 1^41 uiac stromal ceiis 
(primary) 




97500_Patient-12go_adipose


o /.I 


o i / o j_onian miesune 


0.0 


97501_Patient-12sk_skeletal
muscle


0.0


72409_Kidney_Proximal Convoluted
Tubule


0.0


97502_Patient-12ut_uterus


4.6


82685_Small intestine_Duodenum


0.0


97503_Patient-12pl_placenta


8.0 


90650_Adrenal_Adrenocortical 
adenoma 


U.U 


94721_Donor 2 U -
A_Mesenchymal Stem Cells


0.0


72410_Kidney_HRCE


i.i


94722_Donor 2 U -
B_Mesenchymal Stem Cells


0.0


72411_Kidney_HRE


5.3


94723_Donor 2 U -
C_Mesenchymal Stem Cells


0.0


73139_Uterus_Uterine smooth
muscle cells


2.4 



CNS_neurodegeneration_v1.0 Summary: Ag5185 Low levels of expression of
this gene are seen in control temporal cortex and in a hippocampus sample from an
Alzheimer's patient (CTs=34.6-34.9). Therefore, therapeutic modulation of this gene may be
useful in the treatment of neurological disorders, including seizure and memory-related
diseases.

General_screening_panel_v1.5 Summary: Ag5185 Highest expression of this
gene is detected in fetal kidney (CT=26.7). Interestingly, expression of this gene is higher
in fetal as compared to adult kidney (CT=31). This observation suggests that expression of
this gene can be used to distinguish fetal from adult kidney and also from other samples in
this panel. In addition, the relative overexpression of this gene in fetal tissue suggests that
the protein product may enhance kidney growth or development in the fetus and thus may
also act in a regenerative capacity in the adult. Therefore, therapeutic modulation of the
protein encoded by this gene could be useful in treatment of kidney related diseases
including lupus and glomerulonephritis.
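
The relationship between the CT values quoted above and the percentage scale used in the tables can be illustrated with a short calculation. This is a sketch under stated assumptions, not a description of the exact normalization used for these panels: it assumes perfect doubling per cycle and that the sample with the lowest CT on a panel is assigned 100%. The function name `relative_expression_percent` is hypothetical.

```python
def relative_expression_percent(ct: float, ct_min: float) -> float:
    """Expression relative to the lowest-CT sample on the panel, in percent.

    Assumes product doubles each cycle, so every extra cycle needed to reach
    threshold halves the inferred starting amount of transcript.
    """
    return 100.0 * 2.0 ** (ct_min - ct)


# CT values quoted in the summary for Ag5185: fetal kidney is the panel maximum.
fetal_kidney = relative_expression_percent(26.7, ct_min=26.7)
adult_kidney = relative_expression_percent(31.0, ct_min=26.7)
print(round(fetal_kidney, 1), round(adult_kidney, 1))
```

With these inputs the adult kidney value comes out at roughly 5% of the fetal kidney signal, the same order of magnitude as the 4.2 listed for Kidney Pool in Table BD; exact agreement is not expected because the quoted CTs are rounded.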

Moderate to low levels of expression of this gene are also seen in tissues with
metabolic/endocrine functions such as pancreas, adipose, adrenal and pituitary glands,
heart, skeletal muscle, and gastrointestinal tract. Therefore, therapeutic modulation of the
activity of this gene may prove useful in the treatment of endocrine/metabolically related
diseases, such as obesity and diabetes.

Moderate to low levels of expression of this gene are also seen in a number of cancer
cell lines derived from colon, lung, and ovarian cancer. Therefore, therapeutic modulation
of this gene may be useful in the treatment of colon, lung and ovarian cancers.

Panel 1.3D Summary: Ag2078 Three experiments with the same probe-primer set
are in excellent agreement. Highest expression of this gene is seen in fetal kidney
(CTs=26-27.8), with lower expression in the adult lung. This pattern correlates with the
expression seen in panel 1.5. Please see panel 1.5 for further discussion of this gene.
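
One simple way to look at the agreement between the three Ag2078 runs is to compare the per-tissue values across runs. The sketch below uses a few rows transcribed from Table BE that are legible in all three run columns; the choice of tissues and the use of mean and range as agreement measures are assumptions made for illustration.

```python
from statistics import mean

# Rel. Exp.(%) for Ag2078 in the three Panel 1.3D runs, transcribed from Table BE.
RUNS = {
    "Kidney (fetal)": (100.0, 100.0, 100.0),
    "Ovary": (0.6, 0.8, 0.6),
    "Placenta": (0.7, 1.2, 0.7),
    "Adrenal gland": (0.5, 0.5, 0.3),
}

# For each tissue: (mean across runs, spread between highest and lowest run).
summary = {
    tissue: (mean(values), max(values) - min(values))
    for tissue, values in RUNS.items()
}

for tissue, (avg, spread) in summary.items():
    print(f"{tissue}: mean {avg:.2f}, range {spread:.2f}")
```

For these rows the spread between runs stays within about half a percentage point, while fetal kidney is the 100% reference in every run.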

Panel 4D Summary: Ag2078 Highest expression of this gene is detected in
thymus (CT=27.3). This gene or its protein product may thus play an important role in T
cell development. Small molecule therapeutics or antibody therapeutics designed against
the protein encoded by this gene could be utilized to modulate immune function (T cell
development) and be important for organ transplant, AIDS treatment or post-chemotherapy
immune reconstitution.

Moderate to low levels of expression of this gene are also seen in lupus kidney, 
resting and cytokine-activated mucoepidermoid NCI-H292 cells and dermal fibroblasts. 
Therefore, therapeutic modulation of this gene may be useful in the treatment of chronic 
obstructive pulmonary disease, asthma, allergy, emphysema, lupus kidney and skin 
disorders, including psoriasis. 

Panel 5D Summary: Ag2078 Highest expression of this gene is detected in uterus 
and adipose of diabetic patients on insulin (CT=30.9-31). In addition, moderate to low 
levels of expression of this gene are also seen in uterus and placenta. Therefore, therapeutic 
modulation of this gene may be useful in the treatment of obesity and diabetes. 

C. CG118051-02: ALDH8 splice variant, submitted to study 
DDSMT on 09/26/01 by saguo; classification type=Finished In-silico; 
novelty=Update-Variants; ORF start=407, ORF stop=1436, frame=2; 
1586 bp. 

Expression of gene CG118051-02 was assessed using the primer-probe set Ag3729, 
described in Table CA. Results of the RTQ-PCR runs are shown in Tables CB and CC. 
Table CA. Probe Name Ag3729 

Primers | Sequence | Length | Start Position | SEQ ID No 
Forward | 5'-ttcaagaaaacaagcagcttct-3' | | | 
Probe | TET-5'-cccaggacctgcataagccagct-3'-TAMRA | 23 | 309 | 259 
Reverse | 5'-ctcagatatgtctgcctcgaa-3' | 21 | 332 | 260 



Table CB. Panel 2.2 



Tissue Name 


ReL 

Exp.(%) 
Ag3729, 
Run 

174441818 


ReL 

Exp.(%) 
Ag3729, 
Run 

259034396 


Tissue Name 


Rel. 

Exo(%) 
Ag3729, 
Run 

174441818 


Rel. 

Rxn ( %"> 
Ag3729, 
Run 

259034396 


Normal Colon 


U.4 


0.3 


Kidney Margin 
(OD04348) 


0.0 


0.0 


Colon cancer 
(OD06064) 


1.4 


1.0 


Kidney malignant 

cancer 

(OD06204B) 


0.0 


0.0 


Colon Margin 
(OD06064) 


0.0 


0.0 


Kidney normal 
adjacent tissue 
(OD06204E) 


0.0 


0.0 


Colon cancer 
(OD06159) 


0.2 


0.1 


Kidney Cancer 
(OD04450-O1) 


0.0 


0.O 


Colon Margin 
(OD06159) 


0.0 


0.0 


Kidney Margin 
(OD04450-03) 


1.3 


0.9 


Colon cancer 
(OD06297-04) 


0.0 


0.0 


Kidney Cancer 
8120613 


0.0 


00 

wiV 


Colon Margin 
(OD06297-05) 


0.0 


0.0 


Kidney Margin 
8120614 


0.0 


0.0 


CC Gr.2 ascend colon 
iJUUjyZl) 


1.1 


0.8 


Kidney Cancer 
9010320 


0.5 


0.3 


CC Margin 
(OD03921) 


0.0 


0.0 j 


Kidney Margin 
9010321 


1.8 


1.4 


Colon cancer 

metastasis 

(OD06104) 


0.2 


0.1 


Kidney Cancer 
8120607 


0.0 


0.0 


Lung Margin 
(OD06104) 


0.0 


0.0 


Kidney Margin 
8120608 


1.0 


0.8 


Colon mets to lung 
(OD04451-01) 


0.2 


0.2 


Normal Uterus 


0.0 


0.0 


Lung Margin 
(OD04451-02) 


0.0 


0.0 


Uterine Cancer 
064011 


1.8 


1.2 


Normal Prostate 


2.3 


1.8 


Normal Thyroid 


0.0 


0.0 


Prostate Cancer 
(OD04410) 


2.2 


1.6 


Thyroid Cancer 
064010 


3.0 


10 


Prostate Margin 
(OD04410) 


5.1 


3.8 


Thyroid Cancer 
&302152 


3.0 


10 
Normal Ovary 


0.7 


0.3 


-a : B-^jbrif- 

Thvroid IVTar cnn 

A302153 


- " uoue: 
0.0 


0.0 


Ovarian cancer 
(OD06283-03) 


2.5 


1.7 


Normal Breast 


9.2 


6.5 


Ovarian Margin 
(OD06283-O7) 


0.0 


0.0 


Breast Cancer 
(OD04566) 


17.4 


12.9 


Ovarian Cancer 
064008 


1.0 


0.6 


Breast Cancer 1024 


100.0 


100.0 


Ovarian cancer 
(OD06145) 


0.4 


0.3 


Breast Cancer 


3.9 


2.5 


Ovarian Margin 
(OD06145) 


0.5 


0.3 


Breast Cancer Mets 
(0004^00-031 


1.2 


0.9 


Ovarian cancer 
(OD06455-03) 


0.9 


0.5 


Breast Cancer 
Metastasis 


48.6 


34.4 


Ovarian Margin 
(OD06455-07) 


0.0 


0.0 


Breast Cancer 
064006 


2.4 


2.1 


Normal Lung 


0.0 


0.0 


Breast Cancer 
9100266 


55.1 


43.8 


Invasive poordiff. 
lung adeno 
(ODO4945-01 


9.2 


7.5 


Breast Margin 
9100265 


id 7 


l\J.<y 


Lung Margin 
(ODO4945-03) 


0.0 


0.0 


Breast Cancer 
A90Q073 


32.1 


24.5 


Lung Malignant 
Cancer (OD03126) 


0.5 


0.4 


Breast Margin 


9.1 


6.4 


Lung Margin 
(OD03126) 


0.4 


0.3 


Breast cancer 

\Vl/vUUOJ j 


69.7 


61.6 


Lung Cancer 
(OD05014A) 


0.0 


0.0 


Breast cancer node 

JUL 1C loo Lcloi j 

(OD06083) 


98 S 


9*1 ^ 


Lung Margin 
(OD05014B) 


0.8 


0.6 


Normal Liver 


0.0 


0.0 


Lung cancer 
(OD06081) 


44.8 


0.3 


Liver Cancer 1026 


0.0 


0.0 


Lung Margin 
(OD06081) 


0.0 


0.0 


Liver Cancer 1025 


0.8 


0.6 


Lung Cancer 
(OD04237-01) 


3.1 


2.6 


5004-T 


0.2 


0.1 


Lung Margin 
(OD04237-02) 


Q.4 


0.3 


' ivpr Ti^qiip 

l— fi V C-J. A lOO UG 

5004-N 


3.4 


3.3 


Ocular Melanoma 
Metastasis 


3.0 


3.0 


Liver Cancer 
5005-T 


3.0 ( 


3.0 


Ocular Melanoma 
Margin (Liver) 


3.0 


3.0 ; 


Liver Tissue . 
5005-N 


3.0 ( 


3.0 


Melanoma Metastasis 


).0 ( 


,o ; 


Liver Cancer 
)64003 


3.0 ( 


).0 


Melanoma Margin f 
(Lung) 


).3 ( 


).2 


Normal Bladder 


).0 ( 


).0 
Normal Kidney 



Kidney Ca, Nuclear 
grade2(OD04338) 



Kidney Margin 
(OD04338) 



0.0 



1.5 



0.4 



0.0 
1.2 



0.3 



Bladder Cancer 
1023 



i pcwusaa 



Bladder Cancer 
A302173 



Normal Stomach 



3.2 



4.5 



0.0 



2.3 



3.2 



0.0 



Kidney Ca Nnclear 
grade 1/2 (OD04339) 



0.0 



0.0 



Gastric Cancer 
9060397 



0.5 



0.3 



Kidney Margin 
(OD04339) 



0.0 



0.0 



Stomach Margin 
9060396 



2.1 



1.4 



Kidney Ca, Clear cell 
type(OD04340) 



0.0 



0.0 



Gastric Cancer 
9060395 



2.5 



1.7 



Kidney Margin 
(OD04340) 



0.4 



Kidney Ca, Nuclear 
grade 3 (OD04348) 



0.0 



0.3 



Stomach Margin 
9060394 



1.8 



0.0 



Gastric Cancer 
064005 



0.0 



1.1 



0.0 



Table CC. Panel 4.1D 



Tissue Name 


Rel. 
Ep.(%) 
Ag3729, 
Run 

170222887 , 


Tissue Name 


Rel. 

Exp.(%) 
Ag3729, 
Run 

170222887 


Secondary Thl act 


0.0 


HUVEC IL-lbeta 


0.0 


Secondary Th2 act 


0.0 


HUVEC IFN gamma 


0.0 


Secondary Trl act 


0.0 


HUVEC TNF alpha + IFN gamma 


0.0 


Secondary Thl rest 


0.0 


HUVEC TNF alpha + IL4 


0.0 


Secondary Th2 rest 


0.0 


HUVEC 1L-11 


0.0 


Secondary Trl rest 


0.0 


Lung Microvascular EC none 


0.0 


Primary Thl act 


0.0 


Lung Microvascular EC TNFalpha 
+ IL-lbeta 


0.0 


Primary Th2 act 


0.0 


Microvascular Dermal EC none 


0.0 


Primary Trl act 


0.0 


Microsvasular Dermal EC 
TNFalpha + IL-lbeta 


0.0 


Primary Thl rest 


0.0 


Bronchial epithelium TNFalpha + 
BLlbeta 


26.8 


Primary Th2 rest 


0.0 


Small airway epithelium none 


25.5 


Primary Trl rest 


0.0 


Small airway epithelium TNFalpha 
+ EL-lbeta 


46.7 


CD45RA CD4 lymphocyte act 


0.0 


Coronery artery SMC rest 


0.0 ' 


CD45RO CD4 lymphocyte act 


0.0 


Cbronery artery SMC TNFalpha + 
IL-lbeta 


0.0 


CD8 lymphocyte act 


0.0 


Astrocytes rest 


0.0 


Secondary CD8 lymphocyte rest 


0.0 


Astrocytes TNFalpha + IL-lbeta 


0.O 
Secondary CD8 lymphocyte act 


0.0 


KU-812 (Basophil) rest 




-* 


CD4 lymphocyte none 


0.0 


KU-812 (Basophil) 
PMA/ionomycin 


0.0 




2ry Thl/Th2/Trl_anti-CD95 

nji 1 

CJrill 


0.0 


CCD1106 (Keratinocytes) none 


0.0 




LAK cells rest 


0.0 


CCD 1106 (Keratinocytes) 
TNFalpha + EL-lbeta 


6.7 




LAK cells JL-2 


n o 


Liver cirrhosis 


0.0 




LAK cells IL-2+IL-12 


old '" 


NCI-H292 none 


100.0 




LAK cells 17 TFN era mm a 


no 


NCI-H292IL^ 


55.9 




LAK cells IL-2+ IL-1 8 


0.0 


NCI-H292 IL-9 


82.9 




LAK cells JrMA/ionomycin 


0.0 


NCI-H292IL-13 


58.2 


NK Cells IL-2 rest 


0.0 


NCI-H292 IFN gamma 160.3 


Two Way MLR 3 day 


0.0 


HPAEC none |0.0 


Two Way MLR 5 day 


0.0 


HPAEC TNF alpha + IL-1 beta 


[0.0 


rri -¥-r T X AT T> T J 

Two Way MLR 7 day 


0.0 


Lung fibroblast none 


0.0 ' 


PBMC rest 


0.0 


Lung fibroblast TNF alpha + IL-1 
beta 


n n 
U.U 


YVT1 X JTJ~% TW¥ r\ It 

PBMC PWM 


0.0 


Lung fibroblast EL-4 


0.0 


PBMC PHA-L 


0.0 


Lung fibroblast IL-9 


0.0 


Ramos (B cell) none 


7.4 


Lung fibroblast BL-13 


0.0 


Ramos (B cell) ionomycin 


3-1 B 


Lung fibroblast IFN gamma 


0.0 


B lymphocytes PWM 


0.0 


Dermal fibroblast CCD1070 rest 


0.0 


R lvmnhncvtes CTMOT and TT -4 


on 


Dermal fibroblast CCD 1070 TNF 
alpha 


0.0 


EOL-1 dbcAMP 


0.0 


Dermal fibroblast CCD 1070 EL-1 
beta 


0.0 


EOL-1 dbcAMP 

P AA A ft r\in rimvri n 

1 ±VM»J lUIlUlliy C1U 


0.0 


Dermal fibroblast IFN gamrna 


0.0 


DfnrfriHf* ppIIs nnne 


0 0 


Dermal fibroblast IL-4 


0.0 


Dendritic cell? T 


on 

v/.v/ 


Dermal Fibroblasts rest 


0.0 


T^AndTifi^ r***11c <inti_ f ' 1 

.L/ CI lull CCJlb ttJI LI V_/jL/H-W 


U.\J 


Neutrophils TNFa+LPS 


o.o 


Monocytes rest 


0.0 


Neutrophils rest 


0.0 


Monocytes LPS 


0.0 


Colon 


0.O 


Macrophages rest 


0.0 


Lung 


5.3 


Macrophages LPS 


D.O 


thymus 


7.8 


HUVEC none 


10 


Kidney 


2.6 


HUVEC starved 


3.0 







CNS_neurodegeneration_v1.0 Summary: Ag3729 Expression of this gene is 
low/undetectable in all samples on this panel (CTs>35). 
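
The detection cutoff applied in this summary can be sketched in a few lines. This is a minimal illustration only, assuming that a CT above 35 is read as low/undetectable as stated above; the function names and the example CT values are hypothetical and are not taken from the tables.

```python
# Cutoff taken from the summary text: CTs > 35 are treated as low/undetectable.
CT_CUTOFF = 35.0

def is_detectable(ct, cutoff=CT_CUTOFF):
    """Return True when a sample's CT is at or below the cutoff."""
    return ct <= cutoff

def panel_is_undetectable(cts, cutoff=CT_CUTOFF):
    """Return True when every sample on a panel has a CT above the cutoff."""
    return all(not is_detectable(ct, cutoff) for ct in cts)

# Hypothetical CT values, for illustration only.
example_cts = [36.2, 38.0, 40.0]
print(panel_is_undetectable(example_cts))
```

A panel is reported as low/undetectable only when no sample falls at or below the cutoff, which is why the check uses `all` rather than an average.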

Panel 2.2 Summary: Ag3729 Two experiments with the same probe-primer set are in 
good agreement. Highest expression of this gene is seen in breast cancer (CTs=27-29). 
Thus, expression of this gene could be used to differentiate between the breast cancer 
samples and other samples on this panel. 
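
The agreement between the two runs is stated qualitatively. One way to put a number on it is a Pearson correlation over samples measured in both runs; this is an illustrative choice, not a method described in this document. The sketch below uses the Table CB values for Normal Prostate, Prostate Cancer (OD04410), Prostate Margin (OD04410), Normal Breast, Breast Cancer (OD04566), Breast Cancer 1024, Breast Cancer Metastasis and Breast Cancer 9100266; the variable and function names are hypothetical.

```python
import math

# Relative expression (%) for the same samples in the two Ag3729 runs,
# copied from Table CB (Run 174441818 and Run 259034396).
run_a = [2.3, 2.2, 5.1, 9.2, 17.4, 100.0, 48.6, 55.1]
run_b = [1.8, 1.6, 3.8, 6.5, 12.9, 100.0, 34.4, 43.8]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

print(f"Pearson r between runs: {pearson(run_a, run_b):.3f}")
```

A coefficient close to 1 on this subset would be consistent with the "good agreement" noted above, although eight samples are too few to characterize the whole panel.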

In addition, moderate expression of this gene is also seen in cancer samples derived 
from colon, breast, ovarian, lung, bladder, kidney and uterine cancers. Interestingly, 
expression of this gene is higher in cancer than in the corresponding normal adjacent tissue. 
Thus, expression of this gene may be used as a diagnostic marker to detect the presence of 
colon, breast, ovarian, lung, bladder, kidney and uterine cancers, and therapeutic 
modulation of the expression or function of this gene may be effective in the treatment of 
these cancers. 
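
A cancer-versus-margin comparison of this kind can be checked pair by pair. The sketch below is a minimal illustration using five matched cancer/margin pairs whose values are legible in Table CB (Run 174441818); the pair labels and the function name are hypothetical, and the subset is not meant to represent the whole panel.

```python
# (cancer, margin) relative expression (%) for matched samples,
# copied from Table CB, Run 174441818.
pairs = {
    "Colon OD06064": (1.4, 0.0),
    "Colon OD06159": (0.2, 0.0),
    "Ovarian OD06283": (2.5, 0.0),
    "Ovarian OD06145": (0.4, 0.5),
    "Lung OD03126": (0.5, 0.4),
}

def higher_in_cancer(sample_pairs):
    """Return the names of pairs where the cancer sample exceeds its margin."""
    return [name for name, (cancer, margin) in sample_pairs.items()
            if cancer > margin]

up = higher_in_cancer(pairs)
print(f"{len(up)} of {len(pairs)} matched pairs are higher in cancer: {up}")
```

Listing the pairs individually, rather than averaging them, keeps exceptions visible: in this subset one ovarian pair shows slightly higher expression in the margin than in the cancer sample.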

Panel 4.1D Summary: Ag3729 Expression of this gene is restricted to a few 
samples, with highest expression seen in untreated NCI-H292 cells (CT=31.4). The gene 
is also expressed in a cluster of treated and untreated samples derived from the NCI-H292 
cell line, a human airway epithelial cell line that produces mucins. Mucus overproduction is 
an important feature of bronchial asthma and chronic obstructive pulmonary disease. 
Interestingly, the transcript is also expressed at lower but still significant levels in 
small airway and bronchial epithelium treated with IL-1 beta and TNF-alpha and in untreated 
small airway epithelium. The expression of the transcript in this mucoepidermoid cell line, 
which is often used as a model for airway epithelium (NCI-H292 cells), suggests that this 
transcript may be important in the proliferation or activation of airway epithelium. 
Therefore, therapeutics designed with the protein encoded by the transcript may reduce or 
eliminate symptoms caused by inflammation in lung epithelia in chronic obstructive 
pulmonary disease, asthma, allergy, and emphysema. 

D. CG140468-02: SERINE/THREONINE-PROTEIN KINASE 
PAK1. 

Expression of gene CG140468-02 was assessed using the primer-probe set Ag7054, 
described in Table DA. Results of the RTQ-PCR runs are shown in Table DB. Please note 
that CG140468-02 represents a full-length physical clone. 

Table DA. Probe Name Ag7054 

Primers | Sequence | Length | Start Position | SEQ ID No 
Forward | 5'-ggtttgagaagattgccaagc-3' | 21 | 819 | 261 
Probe | TET-5'-cctcactccactgattgctgcagctaa-3'-TAMRA | 27 | 850 | 262 
Reverse | 5'-ctggggtgagtgtggttttag-3' | 21 | 898 | 263 





Table DB. General screening panel v1.6 



Tissue Name 


Rel. 

Exp.(%) 
Ag7054, 
Run 


Tissue Name 


Rel. 

Exp.(%) 
Ag7054, 
Run 

282273878 


AfltTWIF* 




Kenaj ca. iJv-IU 


10.7 


Melanoma* TTc6ftJMA\ T 


7.3 


Bladder 


9.0 


IVlclduOriJa XlSOoO^JJ J. 1 


6.6 


Gastric ca. (liver met.) NCI-N87 


30.6 


XYioidnurna iYii't 


13.3 


Gastric ca. KATO HI 


49.3 


MMannma* T OVTA/TVT 


21.6 


Colon ca. SW-948 


1 o 

7.8 


. lVlC/XdnUIIia oJ\.-xVJUElJL*0 


8.1 


Colon ca. SW480 


2.5 


oquajnous ceu carcinoma oLA^-^t- 


7.7 


Colon ca * (SW480 met) SW620 


11.8 


Tactic Pr»r\l 
1 CSU5 x OOl 


5.6 


Colon ca. HT29 


22.2 




3.3 


Colon ca. HCT-116 


119.1 


Tractate Pr\r\l 

A I Void IC Jl UU1 


8.0 


Colon ca. CaCo-2 


|34.o 


x latCUul 


9.5 


Colon cancer tissue 


[9.0 




2.4 


Colon ca.SW1116 


— — — 

[4.5 


Ovarian ra OVPAR-^ 


100.0 


Colon ca. Colo-205 


Nam 

[10.2 


Ovarian pa ^"FT-OV-"} 


16.4 


Colon ca. SW-48 


|8.0 


Ovarian ca OVPAR-4 


3.3 


Colon Pool 


In | 
ff.l 


Ovarian ca OVCA"R-S 

vtu.ia.JLi vat v/ v vrviv J 


35.1 jSmall Intestine Pool 


O CI 


Ovari an ra TOR OV- 1 


5.3 


Stomach Pool 


5.1 


W Vail ill 1 Ca- KJ V L-Alv-O 


8.4 


Bone Marrow Pool 


3.4 


Ovary 


5.1 


Fetal Heart 


1.5 


Breast ca. MCF-7 


2.2 


Heart Pool 


3.7 


Breast ca. MDA-MB-231 


11.8 


Lymph Node Pool 


8.3 


Breast ca. BT 549 


4.2 


Fetal Skeletal Muscle 


8.1 


Breast ca. T47D 


7.7 


Skeletal Muscle Pool 


4.3 


Breast ca. MDA-N 


5.8 


Spleen Pool 


5.1 


Breast Pool 


8.8 


Thymus Pool 


7.6 


Trachea 


7.7 


CNS cancer (glio/astro) U87-MG 


6.3 


Lung 


4.1 


CNS cancer (glio/astro) U-l 18-MG 


12.7 


Fetal Lung 


7.9 


CNS cancer (neuro;met) SK-N-AS 


5.2 


Lungca. NCI-N417 


7.9 - ( 


CNS cancer (astro) SF-539 


7.4 


Lung ca. LX-1 


19.9 I 


CNS cancer (astro) SNB-75 


14.1 


Lung ca. NCI-H146 


3.5 ( 


CNS cancer (glio) SNB-19 [ 


5.5 


Lung ca. SHP-77 


5.8 ( 


CNS cancer (glio) SF-295 i 


5.8 
Lung ca. A549 8.8 


LJtcLlLi y^iiAllj QAala. J X UUI 


1 «»tt """ft '"*!? 

24. o 


Lungca. NCI-H526 


3.5 




o 


Lungca. NCI-H23 


11.0 


"Rrain ffptaH 

XJlalU ^lClal^ 




Lungca.NCI-H460 


1.0 




11 o 
ZA.Z 


Lungca.HOP-62 


3.5 






Lungca. NCI-H522 


20.7 


xJkalli \Ou\Jbidlllla. HlgTa/ JTUU1 


z/.y 


Liver 


0.7 


XJlalU \ X ildlaAuub J X UUI 


DI.5 


Fetal Liver 1 


9.1 


XJla.Hl \WIlUJC^ 




Liver ca. HepG2 


0.5 


Spinal Cord Pool 


5.0 


Kidney Pool 


11.3 


Adrenal Gland 


4.9 


Fetal Kidney 


16.0 


Pituitary gland Pool 


4.9 


Renal ca. 786-0 


9.9 


Salivary Gland 


2.7 


Renal ca. A498 


4.4 


Thyroid (female) 


5.8 


Renal ca. ACHN 


6.9 


Pancreatic ca. CAP AN2 


9.7 


Renal ca. UO-31 


13.5 


Pancreas Pool 


5.5 



General_screening_panel_v1.6 Summary: Ag7054 Highest expression of this 
gene is detected in an ovarian cancer cell line (CT=25.4). Moderate levels of expression of 
this gene are also seen in a cluster of cancer cell lines derived from pancreatic, gastric, colon, 
lung, liver, renal, breast, ovarian, prostate, squamous cell carcinoma, melanoma and brain 
cancers. Thus, expression of this gene could be used as a marker to detect the presence of 
these cancers. Furthermore, therapeutic modulation of the expression or function of this 
gene may be effective in the treatment of pancreatic, gastric, colon, lung, liver, renal, 
breast, ovarian, prostate, squamous cell carcinoma, melanoma and brain cancers. 

Among tissues with metabolic or endocrine function, this gene is expressed at 
moderate levels in pancreas, adipose, adrenal gland, thyroid, pituitary gland, skeletal 
muscle, heart, liver and the gastrointestinal tract. Therefore, therapeutic modulation of the 
activity of this gene may prove useful in the treatment of endocrine/metabolically related 
diseases, such as obesity and diabetes. 

In addition, this gene is expressed at high levels in all regions of the central nervous 
system examined, including amygdala, hippocampus, substantia nigra, thalamus, 
cerebellum, cerebral cortex, and spinal cord. Therefore, therapeutic modulation of this gene 
product may be useful in the treatment of central nervous system disorders such as 
Alzheimer's disease, Parkinson's disease, epilepsy, multiple sclerosis, schizophrenia and 
depression. 

Interestingly, this gene is expressed at much higher levels in fetal liver 
compared to adult liver (CT=32.7). This observation suggests that expression of this gene 
can be used to distinguish fetal from adult liver. In addition, the relative overexpression of 
this gene in fetal tissue suggests that the protein product may enhance liver growth or 
development in the fetus and thus may also act in a regenerative capacity in the adult. 
Therefore, therapeutic modulation of the protein encoded by this gene could be useful in 
treatment of liver related diseases. 
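
The relationship between the CT values quoted in these summaries and the relative expression percentages in the tables can be sketched as follows. This is an illustration under stated assumptions, not the procedure given in this section: it assumes that relative expression is scaled to the sample with the lowest CT (set to 100%) and that each CT unit corresponds to a two-fold difference in template. The function names are hypothetical.

```python
def relative_expression(ct, ct_min):
    """Percent expression relative to the sample with the lowest CT.

    Assumes 100% PCR efficiency, i.e. a two-fold change per CT unit.
    """
    return 100.0 * 2.0 ** (ct_min - ct)

def fold_difference(ct_a, ct_b):
    """Fold difference of sample A over sample B from their CT values."""
    return 2.0 ** (ct_b - ct_a)

ct_highest = 25.4      # ovarian cancer cell line, from the panel summary
ct_adult_liver = 32.7  # adult liver, from the paragraph above
print(round(relative_expression(ct_adult_liver, ct_highest), 2))
```

Under these assumptions the adult liver value comes out at roughly 0.6%, which is close to the 0.7 listed for Liver in Table DB; the small difference is of the size expected from rounding of the reported CT values.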

E. CG142564-01: CARNITINE 
O-PALMITOYLTRANSFERASE I 

Expression of gene CG142564-01 was assessed using the primer-probe set Ag6952, 
described in Table EA. Results of the RTQ-PCR runs are shown in Table EB. Please note 
that CG142564-02 represents a full-length physical clone. 

Table EA. Probe Name Ag6952 

Primers | Sequence | Length | Start Position | SEQ ID No 
Forward | 5'-tctgctaccaatcccagatcc-3' | 21 | 434 | 264 
Probe | TET-5'-tcgacccagagcagcacccca-3'-TAMRA | 21 | 461 | 265 
Reverse | 5'-catctgctacagggccaaag-3' | | 504 | 266 



Table EB. General screening panel v1.6 



Tissue Name 


Rel. 

Exp.(%) 
Ag6952, 
Run 

278388893 


Tissue Name 


Rel. 

Exp.(%) 
Ag6952, 
Run 

278388893 


Adipose 


4.1 


Renal ca.TK-10 


20.0 


Melanoma* Hs688(A).T 


0.8 


Bladder 


33.4 


Melanoma* Hs688(B).T 


1.2 


Gastric ca. (liver met) NCI-N87 


81.2 


Melanoma* M14 


21.8 


Gastric ca. KATO IH 


8.2 


Melanoma* LOXIMVI 


4.6 


Colon ca. SW-948 


5.4 


Melanoma* SK-MEL-5 


8.5 


Colon ca. SW480 


14.8 


Squamous cell carcinoma SCC-4 


1.6 


Colon ca * (SW480 met) SW620 


17.1 


Testis Pool 


31.6 


Colon ca. HT29 


1.3 


Prostate ca.* (bone met) PC-3 


9.3 


Colon ca.HCT-1 16 


14.3 • 
a 1 UolaLC a \J\Jm 




IColonca C&4 T * - UQO:a 
i^oion ca. i^auo-z 


6.7 


Plappnta 


JO. J 


c^ojon cancer ussue 


f.D 


T ftpm*; Pnnl 


!r> 7 


v-*oion ca. o w 1 1 x o 


A A 

4.4 


Ovarian ca OVPAR-^ 


S ft 


v^oion ca. K*Ol O-ZU J 


4./ 


Ovarian ca 5vK"-OV-^ 




^oion ca. o w-^fo 


z.o 


Ovarian ra OVPAR-d 


1 0 


uoion r OOi 


J A 


Ovarian pa CWfAR-S 


Zj.J 


ouiaii lDiesune Jrooi 


z.y 


Ovarian pa TfTPOV-l 




Stomach Pool 


z.y 


Ovarian p» CWJCAX) fl 


A 7 


Bone Marrow Pool 


1.5 


II V 3 TV 

\jvaiy 




retai rieart 




RrpaQt rn MPT? 7 


Q 7 


xieart r ool 


42.6 


Rrpa^ca 1VTD A - A/TR -931 


0 1 


i-<ynipn iNOue Jrooi 


2.9 


"Rrpncf pq "RT ^AQ 

di cast ca. r> i j'f y 


14..5 


retai okeletal Muscle 


17.9 


DlCdSL Ca. I't/JL/ 


'J ^ 


dKeletaJ Muscle rool 


21.8 


Rrpacf pa A/TPi A M 
£>x Cai>l ca. AVJLL//V-JLN 


U.o 


Spleen Pool 


10.4 




^ i 


x ny mus Jtooj 


17.9 


i racnea 


"3 Q 


CJNk> cancer (glio/astro) Uo7-MCr 


|12.3 


Lung 


^ ri 

j.U 


uiN5 cancer (glio/astro) U-l JLo-Mur 


25.3 


reidi Lrung 


f.D 


CINo cancer (neuro;met) i>K-N-AS 


21.0 


T nnaro XT<^T NT/1 11 




CiNi> cancer (astro) SF-539 


2.6 


JLung ca. 


ZZ.o 


CNS cancer (astro) SNB-75 


16.5 


i_ung ca. i\i_JHtii40 


5.0 


CiNc> cancer (glio) SNB-19 


10.1 


T nncj psi ^PTP 11 
JL»Uiig Ca. OXlx -/ / 


Z0.4 


CJNo cancer (glio) br-Zyo 


61.1 


JL»ung Ca. Aj^7 




brain (Amygdala) Pool 


4.5 • 


idling ca. iNi^x-jrozo 


A 0 

U.o 


Brain (cerebellum) 


39.0 


T una pa ISJPT.W?^ 


no 

1 J.5 | 


tsrain (letai^ 


13.2 


T una ra TSJPT TT/lftft 
JL.Uug ca. IN^l-JlffOU 


no 


Brain (Hippocampus) Pool 


3.6 


T imam HOP £7 
i-iling Ca. lavyr-OZ 


jZ.o 


cereoTal Cortex rool 


3.4 


t ii nor ra Tan w^oo 
idling ca. rN^i-iiDzz 


Zl.O 


Brain (Substantia nigra) Pool 


5.3 


Liver 


U.4 


cram (l nalamus) Pool 


5.6 


T7^iol T ii/ar 
Fclal CdVCr 


z.z 


orain (wnole) 


3.3 


Liver ca. HepG2 


5.0 


Qninal OnrH Pnrtl 




Kidney Pool 


2.7 


Adrenal Gland 


6.9 


Fetal Kidney 


4.6 


Pituitary gland Pool 


3.2 


Renal ca. 786-0 


14.6 


Salivary Gland 


\.9 


Renal ca. A498 | 


1.8 


rhyroid (female) 


1.1 


Renal ca. ACHN 


7.6 3 


Pancreatic ca. CAPAN2 


12.1 


Renal ca. UO-31 


11.9 1 


Pancreas Pool 


5.0 



General_screening_panel_v1.6 Summary: Ag6952 Highest expression of this 
gene is detected in fetal heart (CT=26.7). Moderate to high levels of expression of this gene 
are also seen in tissues with metabolic/endocrine functions such as pancreas, adipose, 
adrenal gland, thyroid, pituitary gland, skeletal muscle, heart, liver and the gastrointestinal 
tract. Therefore, therapeutic modulation of the activity of this gene may prove useful in the 
treatment of endocrine/metabolically related diseases, such as obesity and diabetes. 

Moderate levels of expression of this gene are also seen in a cluster of cancer cell lines 
derived from pancreatic, gastric, colon, lung, liver, renal, breast, ovarian, prostate, 
squamous cell carcinoma, melanoma and brain cancers. Thus, expression of this gene could 
be used as a marker to detect the presence of these cancers. Furthermore, therapeutic 
modulation of the expression or function of this gene may be effective in the treatment of 
pancreatic, gastric, colon, lung, liver, renal, breast, ovarian, prostate, squamous cell 
carcinoma, melanoma and brain cancers. 

In addition, this gene is expressed at moderate levels in all regions of the central 
nervous system examined, including amygdala, hippocampus, substantia nigra, thalamus, 
cerebellum, cerebral cortex, and spinal cord. Therefore, therapeutic modulation of this gene 
product may be useful in the treatment of central nervous system disorders such as 
Alzheimer's disease, Parkinson's disease, epilepsy, multiple sclerosis, schizophrenia and 
depression. 

F. CG142797-01: Cathepsin L like. 

Expression of gene CG142797-01 was assessed using the primer-probe set Ag7539, 
described in Table FA. 

Table FA. Probe Name Ag7539 

Primers | Sequence | Length | Start Position | SEQ ID No 
Forward | 5'-ctctaacacgtgaccacagtctaga-3' | 25 | 68 | 267 
Probe | TET-5'-tcttgtgctttgccttccacttggt-3'-TAMRA | 25 | 103 | 268 
Reverse | 5'-atcttcatgttctccatgtcatataatc-3' | 28 | 128 | 269 



CNS_neurodegeneration_v1.0 Summary: Ag7539 Expression of this gene is 
low/undetectable (CTs > 35) across all of the samples on this panel. 

Panel 4.1D Summary: Ag7539 Expression of this gene is low/undetectable (CTs 
> 35) across all of the samples on this panel. 

G. CG143216-01: Diacylglycerol Kinase 

Expression of gene CG143216-01 was assessed using the primer-probe sets Ag4554 
and Ag7230, described in Tables GA and GB. Results of the RTQ-PCR runs are shown in 
Tables GC, GD, GE and GF. 

Table GA. Probe Name Ag4554 

Primers | Sequence | Length | Start Position | SEQ ID No 
Forward | 5'-aatgctccaggttcaattttct-3' | 22 | 1349 | 270 
Probe | TET-5'-accaaccagcaggaccagtttgactt-3'-TAMRA | 26 | 1390 | 271 
Reverse | 5'-gacgcgataaacttcaacaaaa-3' | 22 | 1419 | 272 



Table GB. Probe Name Ag7230 

Primers | Sequence | Length | Start Position | SEQ ID No 
Forward | 5'-gcatatcgttgttggggact-3' | 20 | 852 | 273 
Probe | TET-5'-atggatgtgtcctcagtccaccacaa-3'-TAMRA | 26 | 880 | 274 
Reverse | 5'-cacggagtagcgaaggagtg-3' | 20 | 911 | 275 
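
The oligonucleotide lengths listed in Tables GA and GB can be cross-checked against the sequences themselves. The sketch below is a minimal consistency check, not part of the described assay; it uses the sequences and lengths from the two tables with the TET and TAMRA labels omitted, and the dictionary and function names are hypothetical. The GC-content helper is an added convenience and is not a quantity reported in this document.

```python
# (sequence, stated length) as listed in Tables GA and GB; dye labels omitted.
oligos = {
    "Ag4554 forward": ("aatgctccaggttcaattttct", 22),
    "Ag4554 probe": ("accaaccagcaggaccagtttgactt", 26),
    "Ag4554 reverse": ("gacgcgataaacttcaacaaaa", 22),
    "Ag7230 forward": ("gcatatcgttgttggggact", 20),
    "Ag7230 probe": ("atggatgtgtcctcagtccaccacaa", 26),
    "Ag7230 reverse": ("cacggagtagcgaaggagtg", 20),
}

def length_mismatches(table):
    """Return names whose sequence length differs from the stated length."""
    return [name for name, (seq, stated) in table.items()
            if len(seq) != stated]

def gc_content(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.lower()
    return 100.0 * (seq.count("g") + seq.count("c")) / len(seq)

print("Length mismatches:", length_mismatches(oligos))
for name, (seq, _) in oligos.items():
    print(f"{name}: {gc_content(seq):.1f}% GC")
```

An empty mismatch list indicates that the transcribed sequences and the Length column agree, which is a useful sanity check on text recovered from a scanned table.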



Table GC. CNS neurodegeneration v1.0 

Tissue Name | Rel. Exp.(%) Ag4554, Run 224721290 | Rel. Exp.(%) Ag7230, Run 288742189 | Tissue Name | Rel. Exp.(%) Ag4554, Run 224721290 | Rel. Exp.(%) Ag7230, Run 288742189 
AD 1 Hippo | 9.3 | 14.1 | Control (Path) 3 Temporal Ctx | 5.7 | 5.3 
AD 2 Hippo | 22.2 | 20.2 | Control (Path) 4 Temporal Ctx | 20.0 | 19.2 
AD 3 Hippo | 10.6 | 9.7 | AD 1 Occipital Ctx | 7.3 | 18.6 
AD 4 Hippo | 7.1 | 5.3 | AD 2 Occipital Ctx (Missing) | 0.0 | 0.0 
AD 5 hippo | 100.0 | 100.0 | AD 3 Occipital Ctx | 11.3 | 8.0 
AD 6 Hippo | 36.9 | 42.0 | AD 4 Occipital Ctx | 19.8 | 13.4 
Control 2 Hippo | 22.7 | 23.8 | AD 5 Occipital Ctx | 15.9 | 18.0 
Control 4 Hippo | 7.7 | 10.2 | AD 6 Occipital Ctx | 53.2 | 54.3 
Control (Path) 3 Hippo | 6.9 | 5.2 | Control 1 Occipital Ctx | | 3.9 
AD 1 Temporal Ctx | 15.7 | 18.2 | Control 2 Occipital Ctx | 81.8 | 90.8 
AD 2 Temporal Ctx | 20.2 | 20.0 | Control 3 Occipital Ctx | 14.4 | 14.7 
AD 3 Temporal Ctx | 9.9 | 8.0 | Control 4 Occipital Ctx | 6.4 | 6.8 
AD 4 Temporal Ctx | 18.8 | 9.8 | Control (Path) 1 Occipital Ctx | 45.4 | 57.8 
AD 5 Inf Temporal Ctx | 97.9 | 81.2 | Control (Path) 2 Occipital Ctx | 6.1 | 6.1 
AD 5 Sup Temporal Ctx | 31.6 | 36.3 | Control (Path) 3 Occipital Ctx | 5.1 | 5.2 
AD 6 Inf Temporal Ctx | 26.2 | 28.9 | Control (Path) 4 Occipital Ctx | 12.6 | 12.8 
AD 6 Sup Temporal Ctx | 29.1 | 33.7 | Control 1 Parietal Ctx | 6.4 | 5.7 
Control 1 Temporal Ctx | 9.5 | 5.1 | Control 2 Parietal Ctx | 26.4 | 26.4 
Control 2 Temporal Ctx | 39.0 | 43.2 | Control 3 Parietal Ctx | 18.0 | 19.6 
Control 3 Temporal Ctx | 10.1 | 11.4 | Control (Path) 1 Parietal Ctx | 56.3 | 70.7 
Control 4 Temporal Ctx | 6.6 | 6.7 | Control (Path) 2 Parietal Ctx | 15.7 | 15.2 
Control (Path) 1 Temporal Ctx | 32.8 | 35.1 | Control (Path) 3 Parietal Ctx | 5.5 | 5.1 
Control (Path) 2 Temporal Ctx | 20.4 | 22.8 | Control (Path) 4 Parietal Ctx | 41.5 | 36.3 



Table GD. General screening panel v1.4 



Tissue Name 


Rel. 

Exp.(%) 
Ag4554, 
Run 

222809973 


Tissue Name 


Rel. 

Exp.(%) 
Ag4554, 
Run 

222809973 


Adipose 


5.4 


Renal ca.TK-10 


34.6 


Melanoma* Hs688(A).T 


45.1 


Bladder 


15.8 


Melanoma* Hs688(B).T 


45.1 


Gastric ca. (liver met) NCI-N87 


21.3 


Melanoma* M14 


85.9 


Gastric ca. KATO III 


84.1 


Melanoma* LOXIMVI 


21.9 


Colon ca. SW-948 


0.7 


Melanoma* SK-MEL-5 


69.7 j 


Colon ca. SW480 


52.5 


Squamous cell carcinoma SCC-4 |26.8 


Colon ca * (SW480 met) SW620 


27.0 


Testis Pool |6.8 | 


Colon ca.HT29 


12.5 
Prostate ca.* (bone met) PC-3 




Colon ca. Jid-iio 


^ ,M °ft .rfi — «n — n* • 

72.7 


3 


r lUMaic iruoj 


A Q 


colon ca. caco-/ 


Off c 

25.5 






D. I 


Colon cancer tissue 


24.1 


u cerus r poi 




colon ca. o W J. 1 10 


o ff 

8.5 


Ovarian r»a fYV/T^AT? 3 


1 A Q 


colon ca. coio-zud 


12.9 




Ovarian r»a QT/" f"\\7 Q 

kjvanan ca. oxv-u vo 


1 AA A 


Colon ca. oW-4o 


6.5 




uvanan ca. lfvc/VK-^- 


1 A O 


colon rool 


15.0 




Ovarian ca. OVCAR-5 


36.1 


omall Intestine Pool 


17.8 




Ovarian ca. IGROV-1 


20.3 


Momaen Pool 


9.0 




Ovarian ca. OVCAR-8 


16.0 


Bone Marrow Pool 


ff rv 

5.0 




Ovary 


15.0 


Fetal Heart 


23.5 


Breast ca. MCF-7 


16.5 


Heart Pool 


12.2 




Breast ca. MDA-MB-231 


51.1 


Lymph Node Pool 


I5.l 


Breast ca. BT 549 {47.3 


Fetal Skeletal Muscle 


4.6 


Breast ca.T47D |62.0 


Skeletal Muscle Pool 


12.0 


Breast ca. MDA-N jl7.8 jSpIeenPool 


10,7 


Breast Pool jl2.5 


Thymus Pool 


26.2 


Trachea |l2.3 


CNS cancer (glio/astro) U87-MG 


65.1 


Lung jl.2 


CNS cancer (glio/astro) U-118-MG 


79.0 


Fetal Lung |27.4 


CNS cancer (neuro^net) SK-N-AS 


48.6 


Lungca.NCI-N417 8.0 


CNS cancer (astro) SF-539 


23.3 


Lungca.LX-1 152.1 


CNS cancer (astro) SNB-75 


89.5 


Lung ca. NCI-H146 j22.5 


CNS cancer (glio) SNB-19 


21.8 


Lungca.SHP-77 97.9 


CNS cancer (glio) SF-295 


63.7 


Lungca.A549 J25.0 


Brain (Amygdala) Pool 


14.8 


Lungca.NCI-H526 I 


0.0 


Brain (cerebellum) 


90.8 


Lungca.Na-H23 


45.1 


Brain (fetal) 


30.4 


Lungca.NQ-H460 


15.9 


Brain (Hippocampus) Pool 


15.0 


Lungca.HOP-62 [27.4 


Cerebral Cortex Pool 


29.3 


Lungca. NCI-H522 


27.9 


Brain (Substantia nigra) Pool 


31.2 


Liver 


3.7 


Brain (Thalamus) Pool 


27.7 


Fetal Liver 


12.0 


Brain (whole) 


29.3 


Liver ca. HepG2 


28.1 


Spinal Cord Pool 


11.8 


Kidney Pool 


25.0 


Adrenal Gland 


29.1 


Fetal Kidney 


13.7 


Pituitary gland Pool 


24.8 


Renal ca. 786-0 


24.0 


Salivary Gland 


11.6 


Renal ca. A498 


i.5 


rhyroid (female) 


ii.5 


Renal ca. ACHN 


5.3 ] 


Pancreatic ca. CAPAN2 


L0.4 


Renal ca.UO-31 


18.8 


Pancreas Pool \ 


M.8 

Table GE. Panel 4.1D 



Tissue Name 


Rel. 

Exp.(%) 
Ag4554, 
Run 

199319739 


Rel. 

Exp.(%) 
Ag7230, 
Run 

288211134 


Tissue Name 


Rel. 

Exp.(%) 
Ag4554, 
Run 

199319739 


Jvel. 

Exp.(%) 
Ag7230, 
Run 

288211134 


Secondary Thl act 


70.2 


48.3 


HUVEC IL-lbeta 


62.9 


38.4 


Secondary Th2 act 


44.8 


30.4 jHUVEC IFN gamma 


50.3 


35.1 


Secondary Trl act 


64.2 


17.8 


HUVEC TNF alpha H 
IFN gamma 


h 18.2 


14.0 


Secondary Thl rest 


17.7 


6.7 


HUVEC TNF alpha -\ 
TLA 


43.2 


13.1 


Secondary Th2 rest 


22.4 


6.6 


HUVEC PL-11 




10./ 


Secondary Trl rest 


17.0 


6.0 


Lung Microvascular 
EC none 


100.0 


100.0 


Primary Thl act 


27.7 


6.0 


Lung Microvascular 
EC TNFalpha -f 

TT 1 

IL-lbeta 


89 A 
5Z.4 


/JO r\ 

42.0 


Primary Th2 act 


42.3 


24.8 


Microvascular 
Dermal EC none 


40.3 


9.7 


Primary Trl act 


39.5 




Microsvasular 
Jjermai rsu 
TNFalpha + IL-lbeta 


/.O.J 


7.1 


Primary Thl rest 


17.2 


12.2 


Bronchial epithelium 
TNFalpha + ILlbeta 


17.7 


5.6 


Primary Th2 rest 


11.0 


10.1 


Small airway 
epithelium none 


4.5 


3.6 


Primary Trl rest 


39.2 


1.2 


Small airway 
epithelium TNFalpha 
+ IL-lbeta j 


11.4 


6.6 


CD45RA CD4 
lymphocyte act 


'XO ft 


18.7 


Coronery artery SMC 
rest 


24.8 


14.1 


CD45RO CD4 
lymphocyte act 


44.4 


31.4 


Coronery artery SMC 
TNFalpha + IL-lbeta 


24.7 


19.8 


CDS lymphocyte act |< 


41.2 


10.8 


Astrocytes rest 


11.7 


10.2 


Secondary CD8 
lymphocyte rest 


43.5 


?.9 


Astrocytes TNFalpha 
f IL-lbeta 


7.8 


J.8 


Secondary CD8 
lymphocyte act 


11.2 


u ] 

I 


£U-812 (Basophil) , 
est 


5.8 i 


1.3 


CD4 lymphocyte none ] 


[9.2 i 


«• : 


CU-812 (Basophil) . 
*MA/ionomycin 


1.9 i 


5.7 


2ry 

Thl/Th2/Trl_anti-CD95 A 
CH11 


t0.9 ] 


.1.2 J 


:CD1106 J 
Keratinocytes) none 


4.6 ] 


3.4 


LAK cells rest 2 


1.0 8 


( 

..0 ( 
1 


XD1106 

Keratinocytes) 5 
Walpha + IL-lbeta 


.7 2 


.1 ~ 
LAK cells IL-2 


23.0 


7777; rrr — «r «=» » iM^ " d 

13.0 jLiver cirrhosis J3.0 


4.0 


LAK cells IL-2+DL-12 


12.7 


1 s 


]\JpT UOQ9 ^on- 

iN\^i*-n>c.yz. none 




7 S 


LAK cells IL-2+EFN 
gamma 


1 A £ 

14.6 


5.6 


NCI-H292IL.4 


7.1 


8.0 


LAK cells IL-2+1L-18 


18.7 


7 7 




y. / 


66 


LAK cells 
PMA/ionomycin 


23.8 


14.3 


NO-H292 IL-13 


10.7 


6.3 


INiv Cells 1L-2 rest 


42.9 


35.8 


Nrn" hoq') tttkt 
in i»j.-xi.zy z ir in 

gamma 


3.2 


1.5 


Two Way MLR 3 day 


22.5 


9.9 


HPAEC none 


31.0 


13.9 


Two Wav MLR 5 dav 




3.3 


HPAECTNF alpha + 
EL-1 beta 


52.5 


31.9 


Two Way MLR 7 day 


21.2 


10.2 


Lung fibroblast none 


16.0 


7.7 


PBMC rest 


12.0 


6.8 


Lung fibroblast TNF 
alpha + IL-1 beta 


16.8 


9.6 


PBMC PWM 


19.3 


J.I 


Lung fibroblast IL-4 


16.3 


7.6 


PBMC PHA-L 


29.9 


14.4 


Lung fibroblast IL-9 


23.2 


1 1 A 


Ramos (B cell) none 


19.3 


6.5 


Lung fibroblast IL-13 


13.8 


7.0 


ionomycin 


21.3 


13.7 


Lung fibroblast IFN 
gamma 


7.1 


6.1 


B lymphocytes PWM 


18.2 


9.9 


Dermal fibroblast 
CCD1070rest 


22.7 


36.6 


■I 1 Vmnfi or vtp q CT)A(W 

and IL-4 


26.4 


25.7 


Dermal fibroblast 
CCD1070TNF alpha 


63.7 


59.5 


EOL-1 dbcAMP 


29.3 


26.2 


Dermal fibroblast 
CCD1070IL-lbeta 


29.9 


19.3 


EOL-1 dbcAMP 
PMA/ionomycin 


23.0 


7.5 


Dermal fibroblast 
IFN gamma 


7.0 


5.6 


Dendritic cells none 


28.9 


17.6 


Dermal fibroblast 
EL-4 


20.6 


12.9 


Dendritic cell<i T 




2.8 


Dermal Fibroblasts 
rest 


15.2 


20.7 


Dendritic cells 
anti-CD40 

****** V>*^ IV 


10.6 


5.3 


Neutrophils 
FNFa+LPS 


18.4 


16.0 


Monocytes rest 


20.7 


7.6 } 


Neutrophils rest 


16.3 : 


>0.6 


Monocytes LPS 


18.2 


15.7 < 


Zolon J 


14.1 : 


5.9 


Macrophages rest ^ 


»0.0 I 


5.2 ] 


-img c 


>.9 : 


1.6 


Macrophages LPS ^ 


(.0 5 


1.0 1 


rhymus 1 


19.2 ', 


'.4 


HUVEC none f 


17.8 : 


il.9 I 


Cidney ] 


8.8 1 


1.6 


HUVEC starved t 


A.2 J 


;o.o j 

Table GF. Panel 5 Islet 

Tissue Name | Rel. Exp.(%) Ag4554, Run 306350410 | Tissue Name | Rel. Exp.(%) Ag4554, Run 306350410 
97457_Patient-02go_adipose | 5.0 | 94709_Donor 2 AM - A_adipose | 20.3 
97476_Patient-07sk_skeletal muscle | [illegible] | 94710_Donor 2 AM - B_adipose | [illegible] 
97477_Patient-07ut_uterus | 5.4 | 94711_Donor 2 AM - C_adipose | 9.5 
97478_Patient-07pl_placenta | 2.6 | 94712_Donor 2 AD - A_adipose | 18.0 
99167_Bayer Patient 1 | 100.0 | 94713_Donor 2 AD - B_adipose | 34.4 
97482_Patient-08ut_uterus | 2.4 | 94714_Donor 2 AD - C_adipose | 17.3 
97483_Patient-08pl_placenta | [illegible] | 94742_Donor 3 U - A_Mesenchymal Stem Cells | [illegible] 
97486_Patient-09sk_skeletal muscle | 3.4 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 9.7 
97487_Patient-09ut_uterus | 3.4 | 94730_Donor 3 AM - A_adipose | 29.1 
97488_Patient-09pl_placenta | 0.9 | 94731_Donor 3 AM - B_adipose | 47.0 
97492_Patient-10ut_uterus | 5.6 | 94732_Donor 3 AM - C_adipose | 33.9 
97493_Patient-10pl_placenta | 6.0 | 94733_Donor 3 AD - A_adipose | 46.3 
97495_Patient-11go_adipose | 4.7 | 94734_Donor 3 AD - B_adipose | 72.7 
97496_Patient-11sk_skeletal muscle | 3.4 | 94735_Donor 3 AD - C_adipose | 13.7 
97497_Patient-11ut_uterus | 6.0 | 77138_Liver_HepG2untreated | 41.5 
97498_Patient-11pl_placenta | 2.0 | 73556_Heart_Cardiac stromal cells (primary) | 8.5 
97500_Patient-12go_adipose | 8.7 | 81735_Small Intestine | 18.0 
97501_Patient-12sk_skeletal muscle | 14.2 | 72409_Kidney_Proximal Convoluted Tubule | 9.3 
97502_Patient-12ut_uterus | 12.3 | 82685_Small intestine_Duodenum | 20.2 
97503_Patient-12pl_placenta | 3.5 | 90650_Adrenal_Adrenocortical adenoma | 10.1 
94721_Donor 2 U - A_Mesenchymal Stem Cells | 21.6 | 72410_Kidney_HRCE | 16.8 
94722_Donor 2 U - B_Mesenchymal Stem Cells | 6.3 | 72411_Kidney_HRE | 6.8 
94723_Donor 2 U - C_Mesenchymal Stem Cells | 20.2 | 73139_Uterus_Uterine smooth muscle cells | 19.5 



CNS_neurodegeneration_v1.0 Summary: Ag4554/Ag7230 Two experiments 
with different probe-primer sets are in excellent agreement. This panel confirms the 
expression of this gene at low levels in the brains of an independent group of individuals. 
However, no differential expression of this gene was detected between Alzheimer's 
diseased postmortem brains and those of non-demented controls in this experiment. Please 
see Panel 1.4 for a discussion of this gene in treatment of central nervous system disorders. 

General_screening_panel_v1.4 Summary: Ag4554 Highest expression of this 
gene is detected in an ovarian cancer cell line (CT=25.4). Moderate levels of expression of 
this gene are also seen in a cluster of cancer cell lines derived from pancreatic, gastric, colon, 
lung, liver, renal, breast, ovarian, prostate, squamous cell carcinoma, melanoma and brain 
cancers. Thus, expression of this gene could be used as a marker to detect the presence of 
these cancers. Furthermore, therapeutic modulation of the expression or function of this 
gene may be effective in the treatment of pancreatic, gastric, colon, lung, liver, renal, 
breast, ovarian, prostate, squamous cell carcinoma, melanoma and brain cancers. 

Among tissues with metabolic or endocrine function, this gene is expressed at 
moderate levels in pancreas, adipose, adrenal gland, thyroid, pituitary gland, skeletal 
muscle, heart, liver and the gastrointestinal tract. Therefore, therapeutic modulation of the 
activity of this gene may prove useful in the treatment of endocrine/metabolically related 
diseases, such as obesity and diabetes. 

In addition, this gene is expressed at high levels in all regions of the central nervous 
system examined, including amygdala, hippocampus, substantia nigra, thalamus, 
cerebellum, cerebral cortex, and spinal cord. Therefore, therapeutic modulation of this gene 
product may be useful in the treatment of central nervous system disorders such as 
Alzheimer's disease, Parkinson's disease, epilepsy, multiple sclerosis, schizophrenia and 
depression. 

Interestingly, this gene is expressed at much higher levels in fetal lung (CT=27.3) than 
in adult lung (CT=31.8). This observation suggests that expression of this gene 
can be used to distinguish fetal from adult lung. In addition, the relative overexpression of 
this gene in fetal tissue suggests that the protein product may enhance lung growth or 
development in the fetus and thus may also act in a regenerative capacity in the adult. 
Therefore, therapeutic modulation of the protein encoded by this gene could be useful in 
treatment of lung related diseases. 
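
As a rough illustration of what the fetal and adult lung CT values imply, the sketch below converts a CT difference into an approximate fold difference. It assumes ideal amplification (a doubling of product per cycle) and that both samples were normalized for RNA input; neither assumption is stated in this passage, and the function name `fold_difference` is hypothetical.

```python
def fold_difference(ct_a: float, ct_b: float, efficiency: float = 2.0) -> float:
    """Approximate fold excess of sample A over sample B from their CT values.

    Assumption: each PCR cycle multiplies the product by `efficiency`
    (2.0 = perfect doubling), so a lower CT means more starting template.
    """
    return efficiency ** (ct_b - ct_a)


# CT values quoted in the summary above.
fetal_lung_ct = 27.3
adult_lung_ct = 31.8

fold = fold_difference(fetal_lung_ct, adult_lung_ct)
print(f"fetal vs adult lung: roughly {fold:.1f}-fold")
```

Under these assumptions the 4.5-cycle gap corresponds to a difference of roughly twentyfold; a lower real-world amplification efficiency would shrink that estimate.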

Panel 4.1D Summary: Ag4554/Ag7230 Two experiments with different 
probe-primer sets are in excellent agreement. Highest expression of this gene is detected in 
lung microvascular endothelial cells (CTs=28-29). This gene is expressed at high to 
moderate levels in a wide range of cell types of significance in the immune response in 
health and disease. These cells include members of the T-cell, B-cell, endothelial cell, 
macrophage/monocyte, and peripheral blood mononuclear cell family, as well as epithelial 
and fibroblast cell types from lung and skin, and normal tissues represented by colon, lung, 
thymus and kidney. This ubiquitous pattern of expression suggests that this gene product 
may be involved in homeostatic processes for these and other cell types and tissues. This 
pattern is in agreement with the expression profile in General_screening_panel_v1.4 and 
also suggests a role for the gene product in cell survival and proliferation. Therefore, 
modulation of the gene product with a functional therapeutic may lead to the alteration of 
functions associated with these cell types and lead to improvement of the symptoms of 
patients suffering from autoimmune and inflammatory diseases such as asthma, allergies, 
inflammatory bowel disease, lupus erythematosus, psoriasis, rheumatoid arthritis, and 
osteoarthritis. 

Panel 5 Islet Summary: Ag4554 Highest expression of this gene is detected in 
islet cells (CT=29.8). This gene shows a widespread expression pattern which correlates 
with the pattern seen in Panel 1.4. Please see Panel 1.4 for further discussion of this gene. 

H. CG143787-01: Disintegrin Protease. 

Expression of gene CG143787-01 was assessed using the primer-probe sets 
Ag6532, Ag6655 and Ag7048, described in Tables HA, HB and HC. Please note that 
CG143787-01 represents a full-length physical clone. 

Table HA. Probe Name Ag6532 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-atcatcaccaaagataccttttatctc-3' | 27 | 474 | 276 
Probe | TET-5'-agaaaccaaagtgcctgctgcaagc-3'-TAMRA | 25 | 501 | 277 
Reverse | 5'-gtgttgtcattatatttgtaggaataggt-3' | 29 | 526 | 278 
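
The primer tables can be sanity-checked mechanically. The sketch below uses the Table HA entries for Ag6532 to confirm that each listed length matches its sequence and to estimate the span covered by the three oligos. It assumes that every "Start Position" is the lowest target coordinate the oligo occupies; the tables do not state this convention, and other tables in this section list positions in descending order, so the span calculation is illustrative only. The `Oligo` class and both helper functions are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Oligo:
    name: str
    sequence: str  # 5'->3' sequence as printed in the table
    length: int    # "Length" column
    start: int     # "Start Position" column

    @property
    def end(self) -> int:
        # Assumption: the oligo covers `length` consecutive positions from `start`.
        return self.start + self.length - 1


# Values transcribed from Table HA (probe set Ag6532).
AG6532 = [
    Oligo("Forward", "atcatcaccaaagataccttttatctc", 27, 474),
    Oligo("Probe", "agaaaccaaagtgcctgctgcaagc", 25, 501),
    Oligo("Reverse", "gtgttgtcattatatttgtaggaataggt", 29, 526),
]


def length_mismatches(oligos):
    """Names of oligos whose listed length disagrees with the sequence."""
    return [o.name for o in oligos if len(o.sequence) != o.length]


def covered_span(oligos):
    """Number of target positions from the first to the last covered base."""
    return max(o.end for o in oligos) - min(o.start for o in oligos) + 1


print(length_mismatches(AG6532))  # [] means every row is self-consistent
print(covered_span(AG6532))
```

A check of this kind is also a cheap way to catch transcription errors in a sequence, since a single inserted or dropped base shows up as a length mismatch.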


Table HB. Probe Name Ag6655 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-atcatcaccaaagataccttttatctc-3' | 27 | 474 | 279 
Probe | TET-5'-agaaaccaaagtgcctgctgcaagc-3'-TAMRA | 25 | 501 | 280 
Reverse | 5'-gtgttgtcattatatttgtaggaataggt-3' | 29 | 526 | 281 



Table HC. Probe Name Ag7048 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-acatcatcaccaaagatacctttta-3' | 25 | 472 | 282 
Probe | TET-5'-caaagtgcctgctgcaagcacctatt-3'-TAMRA | 26 | 507 | 283 
Reverse | 5'-gttcccacacactggtgttg-3' | 20 | 549 | 284 



General_screening_panel_v1.6 Summary: Ag6655/Ag7048 Expression of this 
gene is low/undetectable (CTs > 35) across all of the samples on this panel. 

Panel 4.1D Summary: Ag6655 Expression of this gene is low/undetectable (CTs 
> 35) across all of the samples on this panel. 
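
The "low/undetectable" call in these summaries amounts to a threshold test on CT values. A minimal sketch follows, assuming the cutoff of 35 quoted above; the function names and the example CT lists are hypothetical and are not taken from any run in this document.

```python
UNDETECTABLE_CT = 35.0  # cutoff quoted in the panel summaries


def is_detectable(ct: float, cutoff: float = UNDETECTABLE_CT) -> bool:
    """A sample counts as detectable when its CT does not exceed the cutoff."""
    return ct <= cutoff


def panel_is_silent(cts, cutoff: float = UNDETECTABLE_CT) -> bool:
    """True when every sample on the panel is above the cutoff."""
    return all(not is_detectable(ct, cutoff) for ct in cts)


# Hypothetical CT values, for illustration only.
silent_panel = [36.2, 38.0, 40.0]
mixed_panel = [36.2, 32.6, 39.1]

print(panel_is_silent(silent_panel))
print(panel_is_silent(mixed_panel))
```

The cutoff is a parameter because at least one summary in this section quotes a slightly different threshold (CTs > 34.7).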

I. CG144112-01: NEUROPSIN PRECURSOR. 

Expression of gene CG144112-01 was assessed using the primer-probe set Ag7123, 
described in Table IA. Please note that CG56663-01 represents a full-length physical clone. 

Table IA. Probe Name Ag7123 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-gcctgggcaggaaatacac-3' | 19 | 353 | 285 
Probe | TET-5'-tacgcctgggagaccacagcctacag-3'-TAMRA | 26 | 325 | 286 
Reverse | 5'-tctcggggactgcacttct-3' | 19 | 292 | 287 



CNS_neurodegeneration_v1.0 Summary: Ag7123 Expression of this gene is 
low/undetectable (CTs > 35) across all of the samples on this panel. 

Panel 4.1D Summary: Ag7123 Expression of this gene is low/undetectable (CTs 
> 35) across all of the samples on this panel. 

J. CG144112-04: Kallikrein-8. 

Expression of gene CG144112-04 was assessed using the primer-probe set Ag5271, 
described in Table JA. 

Table JA. Probe Name Ag5271 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-gcagggcagggcgattct-3' | 18 | 97 | 288 
Probe | TET-5'-cacatcctggggctcagacccctgtg-3'-TAMRA | 26 | 153 | 289 
Reverse | 5'-ctagaatcagcccttgctgccta-3' | 23 | 245 | 290 



CNS_neurodegeneration_v1.0 Summary: Ag5271 Expression of this gene is 
low/undetectable (CTs > 35) across all of the samples on this panel. 

Panel 4.1D Summary: Ag5271 Expression of this gene is low/undetectable (CTs 
> 35) across all of the samples on this panel. 

K. CG144686-01: MAST CELL CARBOXYPEPTIDASE A 
PRECURSOR. 

Expression of gene CG144686-01 was assessed using the primer-probe set Ag6864, 
described in Table KA. Results of the RTQ-PCR runs are shown in Tables KB and KC. 
Please note that CG144686-01 represents a full-length physical clone. 

Table KA. Probe Name Ag6864 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-aaccagtgagctccgaga-3' | 18 | 122 | 291 
Probe | TET-5'-caaatttggttttctccttccagaatcc-3'-TAMRA | 28 | 146 | 292 
Reverse | 5'-tctgcacgttggctttat-3' | 18 | 177 | 293 

Table KB. General screening panel v1.6 



Tissue Name 


Rel. 

Exp.(%) 
Ag6864, 
Run 

77fttR7547 


Tissue Name 


ReL 

Exp.(%) 
Ag6864, 
Run 


.riuipi/j>c 


ISO 


IXCJldJ Ld. 1 XV A U 




IVICldllUIJld X±b\JOO\l\J. 1 


yJ.D 


PTaHHpr 

ijjduuer 


00 


lYlCJallUllid iloUOO^D J» X 


n 7 


Onctrir* ra flivpr mpt ^ TVJPT-NR7 


0 0 


lYlGlaJlUlIld lvll*t 


o n 
u.u 


riflc*nV m TCATO TTT 
VJdbUlC Ca. XvrVl V-/ JXl 


00 
u.u 


lVIcJallUnia X^IAA. 11 VI V 1 


n o 
u.u 


L/OJ0I1 Co. O YY ~7 c tO 


00 

u.u 


1V1C1 dllUIIld oJV-lVLDJ_*-J 


U.U 


v^uion Ca. O W *foU 


oo 

u.u 


ot-[udJiJUUo ceil ucucinuind. 


o n 
u.u 


^/Ojon ca. ^w*fou mexj owozu 


00 
u.u 


1 esus r 001 


/ .0 


coion ca. xiizy 


00 

u.u 




u.u 


L,Ol0n Ca. lO 


0 o 
u.u 


r rosiaie Jrooi 




ooion ca. i^av^o-z 


u.u 


Placenta 


U.l 


Colon cancer tissue 


7H 7 
/U. / 


Uterus Pool 




P/%1^« /-.r> QW1 lift 

uoion ca. owiuo 


ft ft 
U.U 


vjvaiian ca. u v i^/vix-o 


u.u 


coion ca. i_oio-zuj 


ft ft 
U.U 


LJvanan ca. ox-uv-j 


u.u 


L,oion ca. jw-ho 


ft ft 

u.u 


UVaTlall Co. V V^/VI\.-^f 


u.u 


P*i1r\-rt P<-kr\1 

colon r OOl 


7R ^ 


Lfvanan ca. u v w»Jvo 


u.u 


Small Intestine Pool 


ft ft 
U.U 


uvanajni ca. ivjtkuv-i 


U.U 


Stomach Pool 


9ft ft 

zu.u 


vjvanan ca, kj v wvxv-o 


u.u 


Bone Marrow Pool 


91 9 


Ovary 


Z.Z) 


Jretai xiean 


A ft 


oreasi ca. jyiL-Jr-/ i 


ft n 
u.u 


raearc Jrooi 


9ft ft 
ZU.U 


RrMct r»a A/TP) A XyTR-O^ 1 ? 
OlCdcd Cd- 1VJL1J/\-1VJLD*-ZfJ JL 


u.u 


j-.yiiipii iNoue rooi 


inn ft 


oredsi ca. jo i 


n 7 
u. / 


Jreuai oKeierai iviuscie 




OlCdol Co.. X / jL/ 


n n 
u.u 


OKCieial lYXUoCIC XT OOl 4 


i < 

l.J 


JjJ Cd-oL Ld. IVJLL/iA 1 1 


n n 

u.u 


vr»lPPn P^/-\i~» 1 

opiccii ruui 


^ 0 

J.U 




u.u 


HiyiUUb J: UUl 


1R 7 


Trachea 


2.5 


PMC ronr(>r ( cr1in/^ictrr»\ T TR7_N/Tf~r 
V^lNo CdJlCcl U^lJU/aollU^ UO/ IVJLkJ 


n o 

u.u 


Lung 


2.7 


v^ino Cancer ^gllO/aoirO/ U ~ 1 1 0 1V1 vJ 


1 R 

1.0 


Fetal Lung 


5.3 


CNS cancer (neuro;met) SK-N-AS 


0.0 


Lungca. NCI-N417 


0.0 


CNS cancer (astro) SF-539 


0.0 


Lung ca. LX-1 


6.0 


CNS cancer (astro) SNB-75 


0.0 


Lungca.NCI-H146 


0.0 


CNS cancer (glio) SNB-19 


0.0 


Lungca.SHP-77 | 


4.5 


CNS cancer (glio) SF-295 


0.0 I 


Lungca. A549 


0.0. 


Brain (Amygdala) Pool 


0 : 0 


Lungca.NCI-H526 


0.0 


Brain (cerebellum) 


o.o 


Lungca.NCI-H23 jo.O 


Brain (fetal) 


0.O 


Lungca.NCI-H460 jO.O 


Brain (Hippocampus) Pool 


O.O 


Lungca.HOP-62 |0.9 


Cerebral Cortex Pool 


D.O 
Lung ca. NCI-H522 


0.0 


Brain (Substantia nigra) Pool 


t\ rv*l< Jsu «mJ m 

O.U 


i 


Liver 


0.0 


Brain (Thalamus) Pool 


0.0 




Fetal Liver 


£. A 

o.O 


Brain (whole) 


0.0 




Liver ca. HepG2 


0.0 


Spinal Cord Pool 


A A 

0.0 




Kidney Pool 


51.4 


Adrenal Gland 


0.7 




Fetal Kidney 


1.1 


Pituitary gland Pool 


l.O 




Renal ca. 786-0 


0.2 


Salivary Gland 


0.0 




Renal ca. A498 


0.0 


Thyroid (female) 


0.2 




Renal ca. ACHN 


0.0 


Pancreatic ca. CAPAN2 


0.0 




Renal ca.UO-31 


0.2 


Pancreas Pool 


10.4 





Table KC. Panel 5 Islet 

Tissue Name | Rel. Exp.(%) Ag6864, Run 305424858 | Rel. Exp.(%) Ag6864, Run 307650498 | Tissue Name | Rel. Exp.(%) Ag6864, Run 305424858 | Rel. Exp.(%) Ag6864, Run 307650498 
97457_Patient-02go_adipose | 5.5 | 34.9 | 94709_Donor 2 AM - A_adipose | 0.0 | 0.0 
97476_Patient-07sk_skeletal muscle | 0.0 | 0.0 | 94710_Donor 2 AM - B_adipose | 0.0 | 0.0 
97477_Patient-07ut_uterus | 1.4 | 32.1 | 94711_Donor 2 AM - C_adipose | 0.0 | 0.0 
97478_Patient-07pl_placenta | 0.0 | 4.7 | 94712_Donor 2 AD - A_adipose | 0.0 | 0.0 
99167_Bayer Patient 1 | 0.0 | 0.0 | 94713_Donor 2 AD - B_adipose | 0.0 | 0.0 
97482_Patient-08ut_uterus | 0.0 | 0.0 | 94714_Donor 2 AD - C_adipose | 2.3 | 0.0 
97483_Patient-08pl_placenta | 0.0 | 0.0 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 0.0 | 0.0 
97486_Patient-09sk_skeletal muscle | 7.6 | 15.5 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 0.0 | 0.0 
97487_Patient-09ut_uterus | 28.7 | 11.2 | 94730_Donor 3 AM - A_adipose | 0.0 | 0.0 
97488_Patient-09pl_placenta | 1.4 | 0.0 | 94731_Donor 3 AM - B_adipose | 0.0 | 1.9 
97492_Patient-10ut_uterus | 10.4 | 7.2 | 94732_Donor 3 AM - C_adipose | 0.0 | 0.0 
97493_Patient-10pl_placenta | 0.0 | 5.9 | 94733_Donor 3 AD - A_adipose | 0.0 | 0.0 
97495_Patient-11go_adipose | 20.0 | 5.0 | 94734_Donor 3 AD - B_adipose | 0.0 | 0.0 
97496_Patient-11sk_skeletal muscle | 0.0 | 8.7 | 94735_Donor 3 AD - C_adipose | 0.0 | 0.0 
97497_Patient-11ut_uterus | [illegible] | 65.1 | 77138_Liver_HepG2untreated | 0.0 | 0.0 
97498_Patient-11pl_placenta | [illegible] | 0.0 | 73556_Heart_Cardiac stromal cells (primary) | 5.1 | 3.2 
97500_Patient-12go_adipose | 59.9 | 59.9 | 81735_Small Intestine | 73.2 | 65.1 
97501_Patient-12sk_skeletal muscle | 100.0 | 100.0 | 72409_Kidney_Proximal Convoluted Tubule | 0.0 | 0.0 
97502_Patient-12ut_uterus | 29.1 | 97.3 | 82685_Small intestine_Duodenum | 59.0 | 67.4 
97503_Patient-12pl_placenta | 5.0 | 2.3 | 90650_Adrenal_Adrenocortical adenoma | 0.0 | 0.0 
94721_Donor 2 U - A_Mesenchymal Stem Cells | 0.0 | 0.0 | 72410_Kidney_HRCE | 0.0 | 0.0 
94722_Donor 2 U - B_Mesenchymal Stem Cells | 10 | 0.0 | 72411_Kidney_HRE | 0.0 | 0.0 
94723_Donor 2 U - C_Mesenchymal Stem Cells | 1.5 | 3.0 | 73139_Uterus_Uterine smooth muscle cells | [illegible] | [illegible] 



General_screening_panel_v1.6 Summary: Ag6864 Highest expression of this 
gene is seen in lymph node (CT=29). Moderate levels of expression are also seen 
predominantly in normal tissue, including adipose, colon, heart, thymus, prostate, and 
kidney, as well as in colon cancer tissue. Thus, expression of this gene could be used to 
identify these samples and tissues. Modulation of the expression of this gene may also be 
effective in the treatment of diseases of these tissues, including cancer, obesity and 
diabetes. 
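
The "Rel. Exp.(%)" columns in the tables rank every sample against the highest-expressing one, which is set to 100. One common way to obtain such percentages from CT values is sketched below. The formula (a doubling per cycle, scaled to the lowest CT) is an assumption and is not spelled out in this passage; only the lymph node CT of 29 comes from the summary above, and the other two samples are hypothetical.

```python
def relative_expression(ct_by_sample, efficiency: float = 2.0):
    """Scale expression so the sample with the lowest CT reads 100 percent.

    Assumption: each cycle multiplies product by `efficiency`, so a sample
    that needs one extra cycle started with half as much template.
    """
    best_ct = min(ct_by_sample.values())
    return {
        sample: 100.0 * efficiency ** (best_ct - ct)
        for sample, ct in ct_by_sample.items()
    }


cts = {
    "Lymph Node Pool": 29.0,        # CT quoted in the summary above
    "hypothetical sample B": 30.0,  # invented for illustration
    "hypothetical sample C": 32.0,  # invented for illustration
}

rel = relative_expression(cts)
for sample, percent in rel.items():
    print(f"{sample}: {percent:.1f}")
```

On this reading, a sample listed at about 50 percent would lie roughly one cycle above the top sample, and one at about 12.5 percent roughly three cycles above it.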

Panel 5 Islet Summary: Ag6864 Two experiments with the same probe and 
primer produce results that are in excellent agreement. Highest expression of this gene is 
seen in skeletal muscle (CTs=33.5). Please see Panel 1.6 for discussion of this gene. 
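
Agreement between two runs can be put on a numerical footing, for example with a Pearson correlation over paired relative-expression values. The sketch below is one possible check and not the method used for this document; the pairs are a subset of the clearly legible rows of Table KC, and the helper name `pearson` is hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Paired Rel. Exp.(%) values from legible rows of Table KC (first run, second run).
run_a = [5.5, 7.6, 28.7, 59.9, 73.2, 100.0, 59.0]
run_b = [34.9, 15.5, 11.2, 59.9, 65.1, 100.0, 67.4]

r = pearson(run_a, run_b)
print(f"Pearson r between runs: {r:.2f}")
```

A coefficient near 1 would support the claim of agreement; with values this close to the detection limit (CTs around 33.5), individual samples can still differ considerably between runs.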

L. CG144906-01: TESTKIN PRECURSOR. 



Expression of gene CG144906-01 was assessed using the primer-probe set Ag6915, 
described in Table LA. Please note that CG144906-01 represents a full-length physical 
clone. 

Table LA. Probe Name Ag6915 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-catgccatcctccacattt-3' | 19 | 337 | 294 
Probe | TET-5'-cagcagtctgfcccggttctcaaactc-3'-TAMRA | 26 | 356 | 295 
Reverse | 5'-gtgcctcatcctctttgatgta-3' | 22 | 398 | 296 



General_screening_panel_v1.6 Summary: Ag6915 Expression of this gene is 
low/undetectable (CTs > 35) across all of the samples on this panel. 

M. CG144997-01: RNase H I. 

Expression of gene CG144997-01 was assessed using the primer-probe set Ag7057, 
described in Table MA. Results of the RTQ-PCR runs are shown in Table MB. Please note 
that CG144997-01 represents a full-length physical clone. 

Table MA. Probe Name Ag7057 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-gtaaacgccgattcctgct-3' | 19 | 468 | 297 
Probe | TET-5'-cttctacgcccattactggagcagca-3'-TAMRA | 26 | 493 | 298 
Reverse | 5'-gaatgagtgcagagacacgttt-3' | 22 | 558 | 299 



Table MB. General screening panel v1.6 

Tissue Name | Rel. Exp.(%) Ag7057, Run 282273884 | Tissue Name | Rel. Exp.(%) Ag7057, Run 282273884 
Adipose | 3.9 | Renal ca. TK-10 | 33.9 
Melanoma* Hs688(A).T | 23.8 | Bladder | 15.7 
Melanoma* Hs688(B).T | 28.3 | Gastric ca. (liver met.) NCI-N87 | 49.0 
Melanoma* M14 | 50.7 | Gastric ca. KATO III | 100.0 
Melanoma* LOXIMVI | 57.8 | Colon ca. SW-948 | 11.4 
Melanoma* SK-MEL-5 | 51.4 | Colon ca. SW480 | 76.3 
Squamous cell carcinoma SCC-4 | 22.5 | Colon ca.* (SW480 met) SW620 | 34.9 
Testis Pool | 9.0 | Colon ca. HT29 | [illegible] 
Prostate ca.* (bone met) PC-3 | 60.3 | Colon ca. HCT-116 | 36.6 
Prostate Pool | 5.4 | Colon ca. CaCo-2 | 42.0 
Placenta | 4.5 | Colon cancer tissue | 17.6 
Uterus Pool | 1.9 | Colon ca. SW1116 | 5.4 
Ovarian ca. OVCAR-3 | 31.2 | Colon ca. Colo-205 | 10.4 
Ovarian ca. SK-OV-3 | 31.4 | Colon ca. SW-48 | 6.8 
Ovarian ca. OVCAR-4 | 17.1 | Colon Pool | 9.5 
Ovarian ca. OVCAR-5 | 39.0 | Small Intestine Pool | 5.7 
Ovarian ca. IGROV-1 | 13.3 | Stomach Pool | 5.1 
Ovarian ca. OVCAR-8 | 15.0 | Bone Marrow Pool | 3.3 
Ovary | 4.9 | Fetal Heart | 4.7 
Breast ca. MCF-7 | 21.8 | Heart Pool | 4.5 
Breast ca. MDA-MB-231 | 17.3 | Lymph Node Pool | 8.9 
Breast ca. BT 549 | 24.8 | Fetal Skeletal Muscle | 4.0 
Breast ca. T47D | 9.5 | Skeletal Muscle Pool | 2.3 
Breast ca. MDA-N | 22.7 | Spleen Pool | 4.1 
Breast Pool | 12.3 | Thymus Pool | 8.2 
Trachea | 7.3 | CNS cancer (glio/astro) U87-MG | 55.5 
Lung | 1.9 | CNS cancer (glio/astro) U-118-MG | 49.7 
Fetal Lung | [illegible] | CNS cancer (neuro;met) SK-N-AS | 49.7 
Lung ca. NCI-N417 | 10.1 | CNS cancer (astro) SF-539 | 22.1 
Lung ca. LX-1 | 22.4 | CNS cancer (astro) SNB-75 | 45.1 
Lung ca. NCI-H146 | [illegible] | CNS cancer (glio) SNB-19 | 16.7 
Lung ca. SHP-77 | 82.9 | CNS cancer (glio) SF-295 | 56.6 
Lung ca. A549 | 54.0 | Brain (Amygdala) Pool | 7.3 
Lung ca. NCI-H526 | 8.9 | Brain (cerebellum) | 20.0 
Lung ca. NCI-H23 | 37.9 | Brain (fetal) | 8.0 
Lung ca. NCI-H460 | 37.1 | Brain (Hippocampus) Pool | 8.1 
Lung ca. HOP-62 | 12.1 | Cerebral Cortex Pool | 12.0 
Lung ca. NCI-H522 | 56.6 | Brain (Substantia nigra) Pool | 6.7 
Liver | 0.8 | Brain (Thalamus) Pool | 12.1 
Fetal Liver | 6.7 | Brain (whole) | 7.1 
Liver ca. HepG2 | 18.6 | Spinal Cord Pool | 6.7 
Kidney Pool | 10.8 | Adrenal Gland | 6.9 
Fetal Kidney | 5.8 | Pituitary gland Pool | 2.9 
Renal ca. 786-0 | 21.6 | Salivary Gland | 2.6 
Renal ca. A498 | 17.1 | Thyroid (female) | 2.5 
Renal ca. ACHN | 17.6 | Pancreatic ca. CAPAN2 | 23.3 
Renal ca. UO-31 | 18.0 | Pancreas Pool | 6.0 



General_screening_panel_v1.6 Summary: Ag7057 Highest expression of this 
gene is detected in a gastric cancer cell line (CT=27). Moderate levels of expression of this 
gene are also seen in a cluster of cancer cell lines derived from pancreatic, gastric, colon, lung, 
liver, renal, breast, ovarian, prostate, squamous cell carcinoma, melanoma and brain 
cancers. Thus, expression of this gene could be used as a marker to detect the presence of 
these cancers. Furthermore, therapeutic modulation of the expression or function of this 
gene may be effective in the treatment of pancreatic, gastric, colon, lung, liver, renal, 
breast, ovarian, prostate, squamous cell carcinoma, melanoma and brain cancers. 

Among tissues with metabolic or endocrine function, this gene is expressed at 
moderate levels in pancreas, adipose, adrenal gland, thyroid, pituitary gland, skeletal 
muscle, heart, liver and the gastrointestinal tract. Therefore, therapeutic modulation of the 
activity of this gene may prove useful in the treatment of endocrine/metabolically related 
diseases, such as obesity and diabetes. 

In addition, this gene is expressed at moderate levels in all regions of the central 
nervous system examined, including amygdala, hippocampus, substantia nigra, thalamus, 
cerebellum, cerebral cortex, and spinal cord. Therefore, therapeutic modulation of this gene 
product may be useful in the treatment of central nervous system disorders such as 
Alzheimer's disease, Parkinson's disease, epilepsy, multiple sclerosis, schizophrenia and 
depression. 

N. CG145494-01: PRESTIN. 

Expression of gene CG145494-01 was assessed using the primer-probe sets 
Ag6694, Ag7803 and Ag7797, described in Tables NA, NB and NC. Results of the 
RTQ-PCR runs are shown in Table ND. 

Table NA. Probe Name Ag6694 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-ggcacagaggccagagat-3' | 18 | 559 | 300 
Probe | TET-5'-gtgaccttactttcaggaatcattcagttttgc-3'-TAMRA | 33 | 604 | 301 
Reverse | 5'-ggctctgtgagatatatggcc-3' | 21 | 663 | 302 

Table NB. Probe Name Ag7803 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-ggagaaccagcaaaatagagct-3' | 22 | 1367 | 303 
Probe | TET-5'-ccaatcccaggaacaaggaggacacaa-3'-TAMRA | 27 | 1409 | 304 
Reverse | 5'-atcacagcagtgatcaaacca-3' | 21 | 1440 | 305 



Table NC. Probe Name Ag7797 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-ccatctggcttaccacttttg-3' | 21 | 1391 | 306 
Probe | TET-5'-cacagcagtgatcaaaccatagtccaatcc-3'-TAMRA | 30 | 1429 | 307 
Reverse | 5'-aaatcacagtcagcagagcaat-3' | 22 | 1462 | 308 




Table ND. General screening panel v1.6 



Tissue Name 


Rel. 

Exp.(%) 
Ag6694, 
Run 

277223811 


Tissue Name 


Rel. 

Exp.(%) 
Ag6694, 
Run 

277223811 


Adipose 


0.0 


Renal ca. TK-10 


0.0 


Melanoma* Hs688(A).T 


0.0 


Bladder 


0.0 


Melanoma* Hs688(B).T 


0.0 


Gastric ca. (liver met.) NCI-N87 


0.0 


Melanoma* M14 


0.0 


Gastric ca. KATO III 


0.0 


Melanoma* LOXMVI 


0.0 


Colon ca.SW-948 


0.0 


Melanoma* SK-MEL-5 


0.0 


Colon ca. SW480 


0.0 


Squamous cell carcinoma SCC-4 


0.0 


Colon ca.* (SW480 met) SW620 


0.0 


Testis Pool 


0.0 


Colon ca. HT29 


0.0 


Prostate ca.* (bone met) PC-3 


100.0 


Colon ca. HCT-116 


0.0 


Prostate Pool 


0.9 


Colon ca. CaCo-2 


0.0 


Placenta 


0.0 


Colon cancer tissue 


0.0 


Uterus Pool 


0.0 


Colon ca.SW1116 


0.0 


Ovarian ca. OVCAR-3 


0.0 


Colon ca. Colo-205 


0.0 


Ovarian ca. SK-OV-3 


0.0 


Colon ca. SW-48 


0.0 


Ovarian ca. OVCAR-4 


0.0 


Colon Pool 


0.0 


Ovarian ca. OVCAR-5 


0.0 


Small Intestine Pool 


o.o 
Ovarian ca. IGROV-1 


n n 
U.U 


otomacn r ooi 


U.U 


1 


Uvanan ca. OVCAK-o 


a n 


Bone Marrow Pool 


A A 

U.U 




Ovary 


a a 
U.U 


retai wean 


ft A 
U.U 




breast ca. Mtr-/ 


a n 
U.U 


Heart r ooi 


A A 

U.U 




breast ca. MIJA-iyld-Zo i 


a a 
U.U 


Lymph Node Pool 


A A 
U.U 




Breast ca. BT 54y 


A A 

U.U 


retal oKejetaj Muscle 


A A 
U.U 


breast ca. T47L> 


A A 

U.U 


oKeietai Muscle Fool 


A A 
U.U 


breast ca. Mx/A-rs 


a a 
U.U 


opieen rooi 


A A 
U.U 


breast Pool 


a a 

U.U 


Thymus Pool 


A A 

0.0 


Trachea 


i a 

1.0 


CNS cancer (glio/astro) U87-MG 


A A 

0.0 


Lung 


A A 


CJNo cancer (glio/astro) U-lls-MU 


A A 

U.U 


retal Lung 


2.9 


CNS cancer (neuro;met) SK-N-Ao 


A A 

0.0 


T „„ XT/""*T X7>1 "1*7 

Lung ca. JNCI-N417 


A A 

0.0 


LJNb cancer (astro) or-559 


A A 

U.U 


Lung ca. LX-1 


0.0 


CNS cancer (astro) SNB-75 


0.0 


T ^ XT/IT T T ■* it £. 

Lung ca. NCI-H146 


0.0 


CNS cancer (gho) SNB-19 


0.0 


Lung ca. SHP-77 


0.0 


CNS cancer (gho) SF-295 


0.0 


r .... a c a f\ 

Lung ca. A549 


A f\ 

0.0 


Brain (Amygdala) Pool 


A A 

0.0 


Lung ca. NCI-H526 


0.0 


Brain (cerebellum) 


14.6 


Lung ca. NCI-H23 


0.0 


Brain (fetal) 


0.0 


Lung ca. NCI-H460 


0.0 


Brain (Hippocampus) Pool 


0.0 


Lungca. HOP-62 


0.0 


Cerebral Cortex Pool 


0.0 


Lung ca. NCI-H522 


0.0 


Brain (Substantia nigra) Pool 


0.0 


Liver 


0.0 


Brain (Thalamus) Pool 


0.0 


Fetal Liver 


o.p 


Brain (whole) 


0.0 


Liver ca. HepG2 


0.0 


opinai i^oru r ooi 


U.U 


Kidney Pool 


0.0 


Adrenal Gland 


0.0 


Fetal Kidney 


0.0 


Pituitary gland Pool 


0.0 


Renal ca. 786-0 


0.0 


Salivary Gland 


0.0 


Renal ca. A498 


0.0 


Thyroid (female) 


0.0 


Renal ca. ACHN 


0.0 


Pancreatic ca. CAPAN2 


0.0 


Renal ca.UO-31 j|0.0 


Pancreas Pool 


0.0 | 



CNS_neurodegeneration_v1.0 Summary: Ag7797 Expression of this gene is 
low/undetectable (CTs > 34.7) across all of the samples on this panel. 

General_screening_panel_v1.6 Summary: Ag6694 Moderate expression 
of this gene is restricted to a prostate cancer cell line (CT=32.6). Therefore, expression of 
this gene may be used to distinguish this sample from other samples in this panel and also 
as a diagnostic marker to detect the presence of prostate cancer. In addition, therapeutic 
modulation of this gene may be useful in the treatment of prostate cancer. 

Panel 4.1D Summary: Ag7803 Expression of this gene is low/undetectable (CTs 
> 35) across all of the samples on this panel. 

O. CG145722-01: WEE1-like protein kinase. 

Expression of gene CG145722-01 was assessed using the primer-probe set Ag6231, 
described in Table OA. Results of the RTQ-PCR runs are shown in Table OB. 

Table OA. Probe Name Ag6231 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-gcttcctggctaatgagatttt-3' | 22 | 1339 | 309 
Probe | TET-5'-agaggattaccggcaccttcccaaag-3'-TAMRA | 26 | 1364 | 310 
Reverse | 5'-tgttaatcccaaggcaaatatg-3' | 22 | 1394 | 311 




Table OB. General screening panel v1.5 



Tissue Name 


Rel. 

Exp.(%) 
Ag6231, 
Run 

259211049 


Tissue Name 


ReL 

Exp.(%) 
Ag6231, 
Run 

259211049 


Adipose 


0.0 


Renal ca. TK-10 


0.0 


Melanoma* Hs688(A).T 


0.0 


Bladder 


0.0 


Melanoma* Hs688(B).T 


0.0 


Gastric ca. (liver met.) NCI-N87 


0.0 


Melanoma* M14 


0.0 


Gastric ca. KATO III 


0.0 


Melanoma* LOXIMVI 


0.0 


Colon ca. SW-948 


0.0 


Melanoma* SK-MEL-5 


0.0 


Colon ca. SW480 


0.0 


Squamous cell carcinoma SCC-4 


0.0 


Colon ca.* (SW480 met) SW620 


0.0 


Testis Pool 


0.0 


Colon ca. HT29 


0.0 


Prostate ca.* (bone met) PC-3 


0.0 


Colon ca.HCT-1 16 


0.0 


Prostate Pool 


0.0 


Colon ca. CaCo-2 


97.3 


Placenta 


0.0 


Colon cancer tissue 


0.0 


Uterus Pool 


0.0 


Colon ca. SW1116 


0.0 


Ovarian ca. OVCAR-3 


0.0 


Colon ca. Colo-205 


0.0 


Ovarian ca. SK-OV-3 


0.0 


Colon ca. SW-48 


0.0 


Ovarian ca. OVCAR-4 


0.0 


Colon Pool 


0.0 


Ovarian ca. OVCAR-5 


0.0 


Small Intestine Pool 


0.0 


Ovarian ca. IGROV-1 


0.0 


Stomach Pool 


0.0 


Ovarian ca. OVCAR-8 


0.0 


Bone Marrow Pool 


0.0 


Ovary 


A A 

u.u 


retai xiean 


A;<^| ulL< M*>jt .il 1 ' aj 


Jtsreast ca. JviLJr-/ 


n n 


xiean r ooi 


A A 

u.u 


DlcaSl Ca. iVJUL^/Y-lVJLD -AO 1 


n n 

u.u 


i^ynipn in otic x ooi 


u.u 


oreast ca. r> i 04y 


u.u 


Jretax oKeieiai jviuscie 


A A 
U.U 


nreasi ca. 


u.u 


oKeietai jviuscje r ooi 


A A 
U.U 


ureasi ca. mjl//\-in 


a n 
U.U 


opieen Jr ooi 


A A 

u.u 


Breast Pool 


a A 
u.u 


Thymus Pool 


A A 

u.u 


Trachea 


U.U 


lino cancer (giio/astro) Uo/-Mvj 


A A 
U.U 


Lung 


A A 
U.U 


UNo cancer (giici/asixo) u-iio-Jviij 


A A 
U.U 


Fetal Lung 


a n 
U.U 


ljno cancer (neuro,metj oJv-jn-Ao 


A A 
U.U 


T n«/> "M/T XT/1 1 T 

Lung ca. INCJHN41 / 


A A 
U.U 


CJNo cancer (astro) orojy 


A A 
U.U 


Lung ca. LX-1 


A A 

U.U 


UNo cancer (astro J oiNr5-/j 


4.2 


T irrtrt r*n 'KlF r T T_J1 A d 

Lung ca. iNd-jii4o 


1 Art A 


CiNo cancer (giio) olNB-iy 


A A 
U.U 


Lung ca. Mir-/ / 


2.3


CNS cancer (glio) SF-295


0.0


Lung ca. A549


0.0


Brain (Amygdala) Pool 


2.3 


Lung ca. NCI-H526


0.0


Brain (cerebellum) 


3.0 


Lung ca. NCI-H23


0.0


Brain (fetal)


2.0 


Lung ca. NCI-H460


0.0


Brain (Hippocampus) Pool 


0.0


Lung ca. HOP-62


0.0


Cerebral Cortex Pool 


0.0 


Lung ca. NCI-H322 


0.0


Brain (Substantia nigra) Pool


0.0


Liver


0.0


Brain (Thalamus) Pool 


0.0


Fetal Liver


0.0


Brain (whole)


3.7


Liver ca. HepG2


0.0


Spinal Cord Pool


0.0


Kidney Pool 


1.8 


Adrenal Gland 


0.0 


Fetal Kidney 


2.2 


Pituitary gland Pool 


0.0 


Renal ca. 786-0 


0.0 


Salivary Gland 


0.0 


Renal ca. A498 


0.0 


Thyroid (female) 


0.0 


Renal ca. ACHN 


0.0 


Pancreatic ca. CAPAN2 


4.6 


Renal ca.UO-31 


6.0 


Pancreas Pool 


0.0 



CNS_neurodegeneration_v1.0 Summary: Ag6231 Expression of this gene is
low/undetectable (CTs > 35) across all of the samples on this panel.

General_screening_panel_v1.5 Summary: Ag6231 Low levels of expression of
this gene are restricted to one lung cancer and one colon cancer cell line (CTs=32.2). Therefore,
expression of this gene may be used to distinguish these cell lines from other samples in
this panel and also as a diagnostic marker to detect the presence of colon and lung cancers. In
addition, therapeutic modulation of this gene may be useful in the treatment of these
cancers.
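
The panel summaries quote raw CT values, while the tables list Rel. Exp.(%). As a rough illustration of how the two might relate, the following Python sketch assumes (this section does not state it) that the sample with the lowest CT is scaled to 100% and that each additional cycle halves the signal, i.e. perfect amplification efficiency. The function names and the example CT values are hypothetical; only the CT > 35 cutoff for low/undetectable expression is taken from the text.

```python
def relative_expression(ct_by_sample):
    """Scale CT values to percent of the highest-expressing sample.

    Assumption: 100% PCR efficiency (signal doubles every cycle), so the
    sample with the lowest CT is assigned 100.0 and a sample that needs
    one more cycle is assigned 50.0.
    """
    if not ct_by_sample:
        return {}
    ct_min = min(ct_by_sample.values())
    return {
        name: 100.0 * 2.0 ** (ct_min - ct)
        for name, ct in ct_by_sample.items()
    }


def is_low_or_undetectable(ct, cutoff=35.0):
    """Apply the CT > 35 rule used in the panel summaries."""
    return ct > cutoff


# Hypothetical CT values, for illustration only (not taken from the tables).
example_cts = {"sample_a": 32.0, "sample_b": 33.0, "sample_c": 40.0}
rel = relative_expression(example_cts)
```

Under these assumptions, `sample_a` maps to 100.0, `sample_b` to 50.0, and `sample_c` (CT = 40) falls below 1% and would be reported as low/undetectable.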

Panel 4.1D Summary: Ag6231 Expression of this gene is low/undetectable (CTs
> 35) across all of the samples on this panel.

P. CG145754-02: Kallikrein-7.

Expression of gene CG145754-02 was assessed using the primer-probe set Ag7038, 
described in Table PA. Results of the RTQ-PCR runs are shown in Tables PB and PC. 
Please note that CG145754-02 represents a full-length physical clone. 
Table PA. Probe Name Ag7038

Primers  Sequence                                    Length  Start Position  SEQ ID No
Forward  5'-tgttaatgacctcaagctcatctc-3'              24      342             312
Probe    TET-5'-ccccaggactgcacgaaggtttacaa-3'-TAMRA  26      367             313
Reverse  5'-tttcttggagtcggggatg-3'                   19      426             314
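
The Length column of a primer table can be cross-checked against the listed sequences. The sketch below does this for the Ag7038 oligonucleotides as they appear in Table PA and also reports GC content, which is not given in the source and is included only as an illustrative extra. The dictionary layout and function names are hypothetical.

```python
# Sequences and listed lengths as given in Table PA (Ag7038).
PRIMERS_AG7038 = {
    "Forward": ("tgttaatgacctcaagctcatctc", 24),
    "Probe": ("ccccaggactgcacgaaggtttacaa", 26),
    "Reverse": ("tttcttggagtcggggatg", 19),
}


def gc_fraction(seq):
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.lower()
    return (seq.count("g") + seq.count("c")) / len(seq)


def length_mismatches(primers):
    """Return the names of oligos whose sequence length differs from the table."""
    return [
        name
        for name, (seq, listed_length) in primers.items()
        if len(seq) != listed_length
    ]


mismatches = length_mismatches(PRIMERS_AG7038)
gc_by_oligo = {name: gc_fraction(seq) for name, (seq, _) in PRIMERS_AG7038.items()}
```

An empty `mismatches` list means every listed length agrees with its sequence; the same check can be applied to the other probe tables by swapping in their sequences.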



Table PB. General screening panel v1.6



Tissue Name 


Rel. 

Exp.(%) 
Ag7038, 
Run 

282273672 


Tissue Name


Rel. 

Exp.(%) 
Ag7038, 
Run 

282273672 


Adipose 


1.6 


Renal ca.TK-10 


0.0 


Melanoma* Hs688(A).T 


0.0 


Bladder 


0.0 


Melanoma* Hs688(B).T 


0.0 


Gastric ca. (liver met.) NCI-N87 


100.0 


Melanoma* M14 


0.0 


Gastric ca. KATO III


22.1 


Melanoma* LOXIMVI 


0.0 


Colon ca. SW-948 


4.4 


Melanoma* SK-MEL-5 


0.0 


Colon ca. SW480 


10.5 


Squamous cell carcinoma SCC-4 


3.0 


Colon ca.* (SW480 met) SW620 


0.0 


Testis Pool 


0.0 


Colon ca. HT29 


0.0 


Prostate ca.* (bone met) PC-3 


0.0 


Colon ca. HCT-116


9.7 


Prostate Pool 


0.0 


Colon ca. CaCo-2 


0.0 


Placenta 


0.0 


Colon cancer tissue 


0.6 


Uterus Pool 


0.0 


Colon ca. SW1116


38.7 


Ovarian ca. OVCAR-3 


4.1 


Colon ca. Colo-205


0.0 


Ovarian ca. SK-OV-3 


0.0 


Colon ca. SW-48 


0.0


Ovarian ca. OVCAR-4 


3.1 


Colon Pool 


5.0 


Ovarian ca. OVCAR-5 


3.0


Small Intestine Pool


3.0


Ovarian ca. IGROV-1


3.0


Stomach Pool


3.0


Ovarian ca. OVCAR-8


0.0


Bone Marrow Pool


0.0


Ovary


0.0


Fetal Heart


U3 


Breast ca. MCF-7


0.0


Heart Pool


1.0 






Breast ca. MDA-MB-231 


0.0


Lymph Node Pool




Breast ca. BT 549


0.0


Fetal Skeletal Muscle


0.0


Breast ca. T47D


0.0


Skeletal Muscle Pool


0.0


Breast ca. MDA-N


0.0


Spleen Pool


0.0


Breast Pool


0.0


Thymus Pool


0.0


Trachea


0.0


CNS cancer (glio/astro) U87-MG


0.0


Lung


0.0


CNS cancer (glio/astro) U-118-MG


0.0


Fetal Lung


0.0


CNS cancer (neuro;met) SK-N-AS


0.0


Lung ca. NCI-N417 


0.0 


CNS cancer (astro) SF-539


0.0


Lung ca. LX-1 


0.5 


CNS cancer (astro) SNB-75


9 O 


Lung ca. NCI-H146 


0.0 


CNS cancer (glio) SNB-19


0.0


Lung ca. SHP-77


0.0


CNS cancer (glio) SF-295


0.0


Lung ca. A549


0.0


Brain (Amygdala) Pool


1 < 

I.J 


Lung ca. NCI-H526


0.0 


Brain (cerebellum) 




Lung ca. NCI-H23 


4.2 


Brain (fetal) 




Lung ca. NCI-H460 


0.0


Brain (Hippocampus) Pool 


4 O 

S -T.\J 


Lung ca. HOP-62 


0.0 


Cerebral Cortex Pool 




Lung ca.NCI-H522 


0.0 


Brain (Substantia nigra) Pool 


1 A. 


Liver 


0.0 


Brain (Thalamus) Pool 




Fetal Liver 


0.0 


Brain (whole) 


ft o 
J.Z 


Liver ca. HepG2 


0.0


Spinal Cord Pool 


_== 


Kidney Pool 


10 


Adrenal Gland 


10 


Fetal Kidney 


1.3 


Pituitary gland Pool 


3.0 


Renal ca. 786-0


0.0


Salivary Gland


0.0


Renal ca. A498


0.6


Thyroid (female)


0.0


Renal ca. ACHN


0.0


Pancreatic ca. CAPAN2


1.2


Renal ca. UO-31


0.0


Pancreas Pool


0.0



Table PC. Panel 5 Islet



Tissue Name 


Rel. 

Exp.(%) 

Ag7038,

Run 

305424861 


Tissue Name 


Rel. 

Exp.(%) 
Ag7038, 
Run 

305424861 


97457_Patient-02go_adipose 


3.0 


94709_Donor 2 AM - A_adipose


0.0 


97476_Patient-07sk_skeletal 
muscle 


0.0 


94710_Donor 2 AM - B_adipose 


100.0 


97477_Patient-07ut_uterus 


0.0 


94711_Donor 2 AM - C_adipose


0.0 


97478_Patient-07pl_placenta 


0.0 


94712_Donor 2 AD - A_adipose


0.0 


99167_Bayer Patient 1 


0.0 


94713_Donor 2 AD - B_adipose 


0.0


97482_Patient-08ut_uterus 


0.0 


94714_Donor 2 AD - C_adipose


0.0


97483_Patient-08pl_placenta


0.0 


94742_Donor 3 U - A_Mesenchymal
Stem Cells


~jt JL ,^\\ .jr £ 


97486_Patient-09sk_skeletal
muscle 


0.0 


94743_Donor 3 U - B_Mesenchymal
Stem Cells 


5.5 


97487_Patient-09ut_uterus


0.0 


94730_Donor 3 AM - A_adipose 


0.0 


97488_Patient-09pl_placenta


0.0 


94731_Donor 3 AM - B_adipose


0.0 


97492_Patient-10ut_uterus


0.0 


94732_Donor 3 AM - C_adipose


0.0 


97493_Patient-10pl_placenta


0.0 


94733_Donor 3 AD - A_adipose


0.0 


97495_Patient-11go_adipose


2.7 


94734_Donor 3 AD - B_adipose 


0.0 


97496_Patient-11sk_skeletal
muscle


0.0


94735_Donor 3 AD - C_adipose


0.0 


97497_Patient-11ut_uterus


0.0 


77138_Liver_HepG2untreated 


0.0 


97498_Patient-11pl_placenta


0.0


73556_Heart_Cardiac stromal cells
(primary) 


0.0 


97500_Patient-12go_adipose


1.5 


81735_Small Intestine 


0.0 


97501_Patient-12sk_skeletal
muscle 


0.0 


72409_Kidney_Proximal Convoluted
Tubule 


2.4 


97502_Patient-12ut_uterus


1.0 


82685_Small intestine_Duodenum


0.0


97503_Patient-12pl_placenta


0.0


90650_Adrenal_Adrenocortical
adenoma 


0.0 


94721_Donor 2 U -
A_Mesenchymal Stem Cells


0.0 


72410_Kidney_HRCE


5.7 


94722_Donor 2 U -
B_Mesenchymal Stem Cells


0.0 


72411_Kidney_HRE


10.2 


94723_Donor2U- 
C_Mesenchymal Stem Cells 


5.0 


73139_Uterus_Uterine smooth 
muscle cells 


5.0 



General_screening_panel_v1.6 Summary: Ag7038 Highest expression of this
gene is detected in the gastric cancer cell line NCI-N87 (CT=31.3). Expression of this gene
seems to be restricted to a number of colon and gastric cancer cell lines. Therefore,
expression of this gene may be used to distinguish colon and gastric cancer cell lines from
other samples in this panel and also as a diagnostic marker to detect the presence of colon
and gastric cancers. In addition, therapeutic modulation of this gene may be useful in the
treatment of colon and gastric cancer.

Panel 5 Islet Summary: Ag7038 Low levels of expression of this gene are
restricted to adipose tissue (CT=33). Therefore, expression of this gene may be used to
distinguish this adipose sample from other samples in this panel. In addition, therapeutic
modulation of this gene may be useful in the treatment of metabolic diseases such as
obesity and diabetes.
Another experiment (Run 307650500) with this probe-primer set showed
low/undetectable expression (CTs > 35) across all of the samples on this panel.

Q. CG145754-03: Kallikrein-7.

Expression of gene CG145754-03 was assessed using the primer-probe set Ag5272, 
described in Table QA. Results of the RTQ-PCR runs are shown in Table QB.
Table QA. Probe Name Ag5272



Primers  Sequence                                 Length  Start Position  SEQ ID No
Forward  5'-ggcagccaggggtgacaa-3'                 18      119             315
Probe    TET-5'-cgccccatgtgcaagaggctccc-3'-TAMRA  23      149             316
Reverse  5'-cctccgcagtggagctgatt-3'               20      201             317



Table QB. Panel 4.1D



Tissue Name 


Rel. 
Exp.(%)
Ag5272, 
Run 

230500478 


Tissue Name 


Rel. 

Exp.(%) 
Ag5272, 
Run 

230500478 


Secondary Thl act 


0.0 


HUVEC IL-lbeta 


0.0 


Secondary Th2 act 


0.0 


HUVEC IFN gamma 


0.0 


Secondary Trl act 


0.0 


HUVEC TNF alpha + IFN gamma 




Secondary Thl rest 


0.0


HUVEC TNF alpha + IL4


0.0 


Secondary Th2 rest 


0.0


HUVEC IL-11


0.0 


Secondary Trl rest 


0.0 


Lung Microvascular EC none 


0.0 


Primary Thl act 


0.0 


Lung Microvascular EC TNFalpha 
+ IL-lbeta 


0.0 


Primary Th2 act 


0.0 


Microvascular Dermal EC none 


0.0 


Primary Trl act 


0.0 


Microsvasular Dermal EC 
TNFalpha + IL-lbeta 


0.0 


Primary Thl rest 


0.0 


Bronchial epithelium TNFalpha + 
ILlbeta 


1.3 


Primary Th2 rest 


0.0 


Small airway epithelium none 


100.0 


Primary Trl rest 


0.0 


Small airway epithelium TNFalpha 
+ IL-1beta


46.7 


CD45RA CD4 lymphocyte act 


3.6 


Coronery artery SMC rest 


3.0 


CD45RO CD4 lymphocyte act 


» 


Coronery artery SMC TNFalpha + 
IL-lbeta 


0.0


CD8 lymphocyte act


0.0


Astrocytes rest


i» Q «Jtt jit' 


Secondary CD8 lymphocyte rest 


0.0 


Astrocytes TNFalpha + IL-lbeta 


0.0


Secondary CD8 lymphocyte act


0.0


KU-812 (Basophil) rest


0.0


CD4 lymphocyte none




KU-812 (Basophil) 
PMA/ionomycin 




2ry Thl/Th2/Trl_anti-CD95 
CH11 


0.0 


CCD1106 (Keratinocytes) none


14.2 


LAK cells rest 


0.0 


CCD1106 (Keratinocytes)
TNFalpha + IL-lbeta 


4.5 


LAK cells IL-2


0.0 


Liver cirrhosis 


0.0 


LAK cells IL-2+IL-12


0.0


NCI-H292 none


0.0


LAK cells IL-2+IFN gamma


0.0


NCI-H292 IL-4


0.0


LAK cells IL-2+IL-18


0.0 


NCI-H292 IL-9 


0.0 


LAK cells PMA/ionomycin 


0.0


NCI-H292 IL-13


0.6 


NK Cells IL-2 rest 


0.0


NCI-H292 IFN gamma


0.0 


Two Way MLR 3 day 


0.0 


HPAEC none 


0.0 


Two Way MLR 5 day 


0.0 


HPAEC TNF alpha + IL-1 beta 


0.0 


Two Way MLR 7 day 


0.0 


Lung fibroblast none 


0.0 


PBMC rest


0.0 


Lung fibroblast TNF alpha + IL-1 
beta 


0.0 


PBMC PWM


0.0 


Lung fibroblast IL-4 


0.0


PBMC PHA-L 


0.0 


Lung fibroblast IL-9 


0.0 


Ramos (B cell) none 


0.0 


Lung fibroblast IL-13 


0.0


Ramos (B cell) ionomycin 


0.0 


Lung fibroblast IFN gamma 


0.0 


B lymphocytes PWM 


0.0 


Dermal fibroblast CCD1070 rest 


0.0 


B lymphocytes CD40L and IL-4


0.0 


Dermal fibroblast CCD1070 TNF 
alpha 


0.0 


EOL-1 dbcAMP 


0.0 


Dermal fibroblast CCD1070 IL-1 
beta 


0.0 


EOL-1 dbcAMP 
PMA/ionomycin 


0.0 


Dermal fibroblast IFN gamma 


0.0 


Dendritic cells none 


0.0 


Dermal fibroblast IL-4 


0.0 


Dendritic cells LPS


0.5 


Dermal Fibroblasts rest 


0.0


Dendritic cells anti-CD40 


0.0 


Neutrophils TNFa+LPS 


J,\J 


Monocytes rest 


0.0 


Neutrophils rest 


0.0


Monocytes LPS


0.0


Colon 


3.0 


Macrophages rest 


3.0 


Lung 


3.0 


Macrophages LPS 


3.0 


Thymus


3.0


HUVEC none


0.0


Kidney


11.2


HUVEC starved


0.0







Panel 4.1D Summary: Ag5272 Highest expression of this gene is seen in resting
small airway epithelium (CT=32). Significant expression of this gene is also seen in
small airway epithelium treated with the cytokines TNF-a and IL-1b. Therefore, modulation of the
expression or activity of the protein encoded by this transcript through the application of
small molecule therapeutics may be useful in the treatment of asthma, COPD, and
emphysema.

R. CG146279-01: Potassium channel subfamily K member 10. 

Expression of gene CG146279-01 was assessed using the primer-probe set Ag6035,
described in Table RA. Results of the RTQ-PCR runs are shown in Tables RB, RC, RD and
RE.

Table RA. Probe Name Ag6035



Primers  Sequence                                Length  Start Position  SEQ ID No
Forward  5'-atgaaatttccaatcgagacg-3'             21      61              318
Probe    TET-5'-ctaaagtggccgttcccgcagc-3'-TAMRA  22      107             319
Reverse  5'-ggggttgcccgttagtg-3'                 17      156             320



Table RB. CNS neurodegeneration v1.0



Tissue Name 


Rel. 

Exp.(%) 
Ag6035, 
Run 

225246892 


Tissue Name


Rel. 

Exp.(%) 
Ag6035, 
Run 

225246892 


AD 1 Hippo 


22.5 


Control (Path) 3 Temporal Ctx 


9.9 


AD 2 Hippo 


25.9 


Control (Path) 4 Temporal Ctx 


38.2 


AD 3 Hippo 


12.4 


AD 1 Occipital Ctx 


22.2 


AD 4 Hippo 


13.5 


AD 2 Occipital Ctx (Missing) 


0.0 


AD 5 Hippo 


82.9 


AD 3 Occipital Ctx 


5.3 


AD 6 Hippo 


74.2 


AD 4 Occipital Ctx 


35.4 


Control 2 Hippo 


21.5 


AD 5 Occipital Ctx 


40.9 


Control 4 Hippo 


19.3 


AD 6 Occipital Ctx 


17.7 


Control (Path) 3 Hippo 


8.2 


Control 1 Occipital Ctx 


4.8 


AD 1 Temporal Ctx 


24.3 


Control 2 Occipital Ctx 


53.2 


AD 2 Temporal Ctx 


43.8 


Control 3 Occipital Ctx 


39.2 


AD 3 Temporal Ctx 


4.5 


Control 4 Occipital Ctx 


8.2 


AD 4 Temporal Ctx 


36.6 


Control (Path) 1 Occipital Ctx 


88.3 


AD 5 Inf Temporal Ctx 1 


100.0 


Control (Path) 2 Occipital Ctx 


7.1 


AD 5 Sup Temporal Ctx 


62.0 


Control (Path) 3 Occipital Ctx


2.5


AD 6 Inf Temporal Ctx


74.7 


Control (Path) 4 Occipital Ctx


j /.i 


AD 6 Sup Temporal Ctx 


65.1 


Control 1 Parietal Ctx




Control 1 Temporal Ctx 


5.8 


Control 2 Parietal Ctx 


77.4 


Control 2 Temporal Ctx 


29.5 


Control 3 Parietal Ctx 


17.1 


Control 3 Temporal Ctx 


22.7 


Control (Path) 1 Parietal Ctx 


77.9 


Control 3 Temporal Ctx 


22.7 


Control (Path) 2 Parietal Ctx 


22.4 


Control (Path) 1 Temporal Ctx 


74.2 


Control (Path) 3 Parietal Ctx 


6.3 


Control (Path) 2 Temporal Ctx 


47.0 


Control (Path) 4 Parietal Ctx


51.4 



Table RC. General screening panel v1.5



Tissue Name 


Rel. 

Exp.(%) 
Ag6035, 
Run 


Tissue Name


Rel. 

Exp.(%) 
Ag6035, 
Run 


Adipose 


0.5 


Renal ca. TK-10




Melanoma* Hs688(A).T 


0.0 


Bladder


2.0


Melanoma* Hs688(B).T 


0.0 






Melanoma* M14 


0.0 


Gastric ca. KATO III




Melanoma* LOXIMVI


0.0 




1 ft 


Melanoma* SK-MEL-5 


0.0 


Colon ca. SW480


14.8


Squamous cell carcinoma SCC-4 


0.0 


Colon ca.* (SW480 met) SW620




Testis Pool 


1.3 


Colon ca. HT29 


1.7 


Prostate ca.* (bone met) PC-3 


0.0 


Colon ca. HCT-116


12.7 


Prostate Pool 


4.7 


Colon ca. CaCo-2 


12.3 


Placenta 


2.0 


Colon cancer tissue 


5.3 


Uterus Pool 


2.5 


Colon ca. SW1116 


0.0 


Ovarian ca. OVCAR-3 


3.3 


Colon ca. Colo-205 


3.7 


Ovarian ca. SK-OV-3 


2.8 


Colon ca. SW-48 


3.4 


Ovarian ca. OVCAR-4 


3.8 


Colon Pool 


0.9 


Ovarian ca. OVCAR-5 


7.0 


Small Intestine Pool 


1.5 


Ovarian ca. IGROV-1 


10.4 


Stomach Pool 


2.1 ■ 


Ovarian ca. OVCAR-8 


3.1 


Bone Marrow Pool 


0.5 


Ovary 


1.1 


Fetal Heart 


1.3 | 


Breast ca. MCF-7 


3.7 


Heart Pool 


0.2 


Breast ca. MDA-MB-231 


6.9 


Lymph Node Pool 


0.9


Breast ca. BT 549 


2.0 


Fetal Skeletal Muscle 


1.4 


Breast ca. T47D 


1.1 


Skeletal Muscle Pool 


23 


Breast ca. MDA-N 


4.3 


Spleen Pool 


3.6 


Breast Pool 


19 


Thymus Pool


2.8


Trachea


3.2


CNS cancer (glio/astro) U87-MG


0.0


Lung 


LI < 


CNS cancer (glio/astro) U-118-MG


2.8


Fetal Lung


4.1


CNS cancer (neuro;met) SK-N-AS


& 3:313?' Ij 


Lung ca. NCI-N417


0. Q 

o.y 


CNS cancer (astro) SF-539


i 


Lung ca. LX-1


ia i 


CNS cancer (astro) SNB-75


4.6


Lung ca. NCI-H146


Q A 

o.4 


CNS cancer (glio) SNB-19


4.4


Lung ca. SHP-77




CNS cancer (glio) SF-295


11.4 


Lung ca. A549


1j.3 


Brain (Amygdala) Pool 


15.1


Lung ca. NCI-H526


A O 

4,o 


Brain (cerebellum) 


100.0 


Lung ca. NCI-H23


J. 1 


Brain (fetal)


92.7 


Lung ca. NCI-H460


7.9 


Brain (Hippocampus) Pool


32.1 


Lung ca. HOP-62


0.0


Cerebral Cortex Pool 


21.8 


Lung ca. NCI-H522


0.0


Brain (Substantia nigra) Pool


18.4 


Liver 


0.5


Brain (Thalamus) Pool


24.8


Fetal Liver


2.0


Brain (whole) 


29.9 


Liver ca. HepG2 


74 


Spinal Cord Pool




Kidney Pool 


1.6 


Adrenal Gland 


22 


Fetal Kidney 


3.5 


Pituitary gland Pool 


3.7 


Renal ca. 786-0 


2.4 


Salivary Gland 


1.0 


Renal ca. A498 


2.4 


Thyroid (female) 


2.0 1 


Renal ca. ACHN 


11.8 


Pancreatic ca. CAPAN2 


0.0 


Renal ca. UO-31 


6.2 


Pancreas Pool 


0.6 



Table RD. Panel 4.1D 



Tissue Name 


Rel.
Exp.(%)
Ag6035, 
Run 

225157775 


Tissue Name 


Rel.

Exp.(%) 
Ag6035, 
Run 

225157775 


Secondary Thl act 


0.0 


HUVEC IL-1beta


0.0 


Secondary Th2 act 


0.0 


HUVEC IFN gamma


0.0 


Secondary Trl act 


0.0 


HUVEC TNF alpha + IFN gamma 


0.0 


Secondary Thl rest 


0.0 


HUVEC TNF alpha + IL4 


0.0 


Secondary Th2 rest 


0.0 


HUVEC IL-11 


0.0 


Secondary Trl rest 


0.0 


Lung Microvascular EC none 


0.0 


Primary Thl act 


0.0 


Lung Microvascular EC TNFalpha 
+ IL-lbeta 


0.0 


Primary Th2 act 


0.0 


Microvascular Dermal EC none 


0.0 


Primary Trl act 


0.0 


Microsvasular Dermal EC 
TNFalpha + IL-lbeta 


0.0 


Primary Thl rest 


0.0 


Bronchial epithelium TNFalpha + 
IL-1beta


0.0 


Primary Th2 rest 


0.0 


Small airway epithelium none 


0.0 


Primary Trl rest 


0.0 


Small airway epithelium TNFalpha 
+ IL-1beta


0.0


CD45RA CD4 lymphocyte act


0.0 


Coronery artery SMC rest 




CD45RO CD4 lymphocyte act 


0.0 


Coronery artery SMC TNFalpha + 
IL-lbeta 


0.0 


CD8 lymphocyte act 


0.0 


Astrocytes rest 


0.0 


Secondary CD8 lymphocyte rest 


0.0 


Astrocytes TNFalpha + IL-lbeta 


0.0 


Secondary CD8 lymphocyte act 


0.0 


KU-812 (Basophil) rest 


0.0 


CD4 lymphocyte none


0.0


KU-812 (Basophil) 
PMA/ionomycin 


0.0


2ry Thl/Th2/Trl_anti-CD95 
CH11 


0.0 


CCD1106 (Keratinocytes) none


0.0 


LAK cells rest 


0.0 


CCD1106 (Keratinocytes) 

TNFalpha + IL-1beta


0.0 


JLirVJV LCI la UU~£r 




Liver cirrhosis






U.\J 


NCI-H292 none




LAK cells IL-2+IFN gamma


0.0


NCI-H292 IL-4




LAK cells IL-2+ IL-18


0.0 


NCI-H292IL-9 


0.0 


LAK cells PMA/ionomycin 


0.0 


NCI-H292 IL-13 


0.0 


NK Cells IL-2 rest


0.0 


NCI-H292 IFN gamma 


0.0 


Two Way MLR 3 day 


10.1 


HPAEC none


0.0 


Two Way MLR 5 day 


0.0 


HPAEC TNF alpha + IL-1 beta 


0.0 


Two Way MLR 7 day 


0.0 


Lung fibroblast none 


0.0 


PBMC rest


5.5


Lung fibroblast TNF alpha + IL-1 
beta 


0.0 


PBMCPWM 


0.0 


Lung fibroblast IL-4 


0.0 


PBMCPHA-L 


0.0 


Lung fibroblast IL-9 


0.0 


Ramos (B cell) none


0.0 


Lung Fibroblast IL-13 


0.0 


Ramos (B cell) ionomycin 


0.0 


Lung fibroblast IFN gamma 


0.0 


B lymphocytes PWM 


0.0 


Dermal fibroblast CCD1070rest 


0.0 




Oft 


Dermal fibroblast CCD1070 TNF 
alpha 


0.0


EOL-1 dbcAMP 


100.0 


Dermal fibroblast CCD1070 IL-1 
beta 


0.0 


EOL-1 dbcAMP 
PMA/ionomycin


36.1 


Dermal fibroblast IFN gamma 


0.0 


Dendritic cells none


0.0


Dermal fibroblast IL-4








Dermal Fibroblasts rest


0.0


Dendritic cells anti-CD40 


0.0 


Neutrophils TNFa+LPS 


0.0 


Monocytes rest 


9.9 


Neutrophils rest 


0.0 


Monocytes LPS 


0.0 


Colon 


0.0 


Macrophages rest 


0.0 


Lung 


0.0 


Macrophages LPS 


0.0 


Thymus 


8.5 


HUVEC none


0.0 


Kidney 


7.7 


HUVEC starved 


0.0

Table RE. Panel 5 Islet



Tissue Name


Rel.
Exp.(%)
Ag6035,
Run
253578284


Rel.
Exp.(%)
Ag6035,
Run
306414003


Tissue Name


Rel.
Exp.(%)
Ag6035,
Run
253578284


Rel.
Exp.(%)
Ag6035,
Run
306414003


97457_Patient-02go_adipose


0.0 


0.0 


94709_Donor 2 AM -
A_adipose 


0.0 


0.0 


97476_Patient-07sk_skeletal
muscle


0.0 


0.0 


94710_Donor 2 AM -
B_adipose 


0.0 


0.0 


97477_Patient-07ut_uterus


0.0 


0.0 


94711_Donor 2 AM -
C_adipose 


0.0 


0.0 


97478_Patient-07pl_placenta


0.0 


0.0 


94712_Donor2AD- 
A_adipose 


0.0 


0.0 


99167_Bayer Patient 1


100.0 


100.0 


94713_Donor 2 AD -
B_adipose 


0.0 


0.0 


97482_Patient-08ut_uterus


0.0 


0.0 


94714_Donor2AD- 
C_adipose 


0.0 


0.0 


97483_Patient-08pl_placenta


0.0 


0.0 


94742_Donor 3 U -
A_Mesenchymal Stem Cells


0.0 


0.0 


97486_Patient-09sk_skeletal
muscle


0.0 


0.0 


94743_Donor3U- 
B_Mesenchymal Stem Cells 


0.0 


0.0 


97487_Patient-09ut_uterus


0.0 


0.0 


94730_Donor3AM- 
A_adipose 


0.0 


0.0 


97488_Patient-09pl_placenta


0.0 


0.0 


94731_Donor 3 AM -
B_adipose


0.0


0.0 


97492_Patient-10ut_uterus


0.0 


0.0 


94732_Donor 3 AM -
C_adipose 


0.0 


0.0 


97493_Patient-10pl_placenta


0.0 


0.0 


94733_Donor 3 AD -
A_adipose


0.0


0.0 


97495_Patient-11go_adipose


0.0 


0.0 


94734_Donor3AD- 
B_adipose 


0.0 


0.0 


97496_Patient-11sk_skeletal
muscle


0.0


0.0


94735_Donor 3 AD - 
C_adipose 


0.0 


0.0 


97497_Patient-11ut_uterus


0.0


0.0


77138_Liver_HepG2untreated


0.0 


0.0 


97498_Patient-11pl_placenta


0.0 


0.0 


73556_Heart_Cardiac stromal 
cells (primary) 


0.0 


0.0 


97500_Patient-12go_adipose


0.0


0.0


81735_Small Intestine


0.0


0.0


97501_Patient-12sk_skeletal
muscle


0.0 


0.0 


72409_Kidney_Proximal 
Convoluted Tubule 


0.0 


0.0 


97502_Patient-12ut_uterus


0.0 


0.0 


82685_Small
intestine_Duodenum 


0.0 


0.0 


97503_Patient-12pl_placenta


0.0 


0.0 


90650_Adrenal_Adrenocortical
adenoma


0.0 


16.2


94721_Donor 2 U -
A_Mesenchymal Stem
Cells


0.0


0.0


72410_Kidney_HRCE


0.0


0.0 


94722_Donor 2 U -
B_Mesenchymal Stem
Cells 


0.0 


0.0 


72411_Kidney_HRE


0.0 


0.0 


94723_Donor 2 U -
C_Mesenchymal Stem 
Cells 


0.0 


0.0 


73139_Uterus_Uterine
smooth muscle cells


0.0


0.0 



CNS_neurodegeneration_v1.0 Summary: Ag6035 This panel confirms the
expression of this gene at low levels in the brains of an independent group of individuals.
However, no differential expression of this gene was detected between Alzheimer's
diseased postmortem brains and those of non-demented controls in this experiment. Please
see Panel 1.5 for a discussion of this gene in the treatment of central nervous system disorders.

General_screening_panel_v1.5 Summary: Ag6035 Highest expression of this
gene is detected in cerebellum (CT=27). This gene codes for a splice variant of potassium
channel TREK2. As reported in the literature (Bang et al., 2000, J Biol Chem 275(23):17412-9,
PMID: 10747911), this gene is preferentially expressed in all regions of the brain.
Therefore, therapeutic modulation of this gene product may be useful in the treatment of
central nervous system disorders such as Alzheimer's disease, Parkinson's disease, epilepsy,
multiple sclerosis, schizophrenia and depression.

Moderate to low levels of expression of this gene are also seen in a number of cancer
cell lines derived from brain, colon, gastric, renal, lung, breast and ovarian cancer.
Therefore, therapeutic modulation of this gene may be useful in the treatment of these
cancers.

In addition, low levels of expression of this gene are also seen in tissues with
metabolic/endocrine functions, including pancreas, adipose, adrenal gland, thyroid,
pituitary gland, skeletal muscle, heart, liver and the gastrointestinal tract. Therefore,
therapeutic modulation of the activity of this gene may prove useful in the treatment of
endocrine/metabolically related diseases, such as obesity and diabetes.

Panel 4.1D Summary: Ag6035 Highest expression of this gene is detected in
eosinophils (CT=32.5). Low levels of expression of this gene are also seen in
PMA/ionomycin treated eosinophils. Therefore, therapeutic modulation of this gene or its
protein product may be useful in the treatment of hematopoietic disorders involving
eosinophils, parasitic infections, autoimmune and inflammatory diseases including allergy
and asthma.

Panel 5 Islet Summary: Ag6035 Two experiments with the same probe-primer set
are in excellent agreement. Low levels of expression of this gene are restricted to islet cells
(CTs=33-34). This gene codes for a splice variant of potassium channel TREK2. Potassium
channels play an important role in insulin secretion by islet beta cells upon stimulation by
glucose. Alteration in the insulin secretion pathway through the use of sulfonylureas or
genetic inactivation of K(ATP) channels may lead to inappropriate insulin secretion at low
glucose (Henquin JC, 2000, Diabetes 49(11):1751-60, PMID: 11078440). Therefore,
therapeutic modulation of this gene or its protein product may be useful in the treatment
of type 2 diabetes.

S. CG146403-01: Diacylglycerol acyltransferase 2. 

Expression of gene CG146403-01 was assessed using the primer-probe set Ag6034, 
described in Table SA. Results of the RTQ-PCR runs are shown in Tables SB, SC and SD. 
Table SA. Probe Name Ag6034



Primers  Sequence                                    Length  Start Position  SEQ ID No
Forward  5'-tggggagaatgacatctttaga-3'                22      540             321
Probe    TET-5'-cttaaggcttttgccacaggctcctg-3'-TAMRA  26      562             322
Reverse  5'-agagaagcccatgagcttctt-3'                 21      613             323



Table SB. General screening panel v1.5



Tissue Name 


Rel. 

Exp.(%) 
Ag6034, 
Run 

228763480 


Tissue Name


Rel. 

Exp.(%) 
Ag6034, 
Run 

228763480 


Adipose 


0.2 


Renal ca. TK-10 


27.9 


Melanoma* Hs688(A).T 


0.0 


Bladder 


1.2 


Melanoma* Hs688(B).T 


0.0 


Gastric ca. (liver met.) NCI-N87 


05 


Melanoma* M14 


0.1 


Gastric ca. KATO III


7.9 


Melanoma* LOXIMVI 


0.0 


Colon ca. SW-948 


3.6 


Melanoma* SK-MEL-5 


0.2 


Colon ca. SW480 


12.5


Squamous cell carcinoma SCC-4


0.0


Colon ca.* (SW480 met) SW620


/1137I 
t«" 


Testis Pool


0.2


Colon ca. HT29




Prostate ca.* (bone met) PC-3


0.4


Colon ca. HCT-116


on 

v/.v/ 


Prostate Pool


0.0


Colon ca. CaCo-2


100.0


Placenta 


0.0


Colon cancer tissue




Uterus Pool 


0.0


Colon ca. SW1116


0.0


Ovarian ca. OVCAR-3 


0.0


Colon ca. Colo-205




Ovarian ca. SK-OV-3




Colon ca. SW-48


jv.U 


Ovarian ca. OVCAR-4


0.0


Colon Pool


0.2


Ovarian ca. OVCAR-5


0.1 


Small Intestine Pool




Ovarian ca. IGROV-1


0.1


Stomach Pool


0.0


Ovarian ca. OVCAR-8


0.2


Bone Marrow Pool




Ovary 


0.0 


Fetal Heart


0.2


Breast ca. MCF-7 


0.0 


Heart Pool




Breast ca. MDA-MB-231 


0.0 


Lymph Node Pool


0.0


Breast ca. BT 549 


0.1 


Fetal Skeletal Muscle


0.1 


Breast ca. T47D 


0.0 


Skeletal Muscle Pool


0.0


Breast ca. MDA-N


0.0


Spleen Pool


0.0


Breast Pool 


fei 


Thymus Pool 


0.1


Trachea 


0.0 


CNS cancer (glio/astro) U87-MG


0.0


Lung 


0.0 


CNS cancer (glio/astro) U-118-MG


0.0


Fetal Lung 


0.4 


CNS cancer (neuro;met) SK-N-AS


0.0


Lung ca. NCI-N417


0.1 


CNS cancer (astro) SF-539


0.2


Lung ca. LX-1 


28.1 


CNS cancer (astro) SNB-75


0.0


Lung ca. NCI-H146 


0.0 


CNS cancer (glio) SNB-19


0.0


Lung ca. SHP-77


0.0


CNS cancer (glio) SF-295


0.0


Lung ca. A549 


0.7 


Brain (Amygdala) Pool


0.0


Lung ca. NCI-H526 


0.0 


Brain (cerebellum)


0.1


Lung ca. NCI-H23 


0.0 


Brain (fetal)


0.2


Lung ca. NCI-H460 


4.2 


Brain (Hippocampus) Pool


0.0


Lung ca. HOP-62 


0.0 


Cerebral Cortex Pool


0.1


Lung ca. NCI-H522 


0.2 


Brain (Substantia nigra) Pool


0.0


Liver 


1.7 


Brain (Thalamus) Pool


li.U 


Fetal Liver






1 1 

1.1 


Liver ca. HepG2 


62.9 


Spinal Cord Pool 


3.0 


Kidney Pool 


0.0 


Adrenal Gland


3.0 


Fetal Kidney 


5.1 ) 


Pituitary gland Pool 


3.0 


Renal ca. 786-0 


0.0


Salivary Gland


3.0 


Renal ca. A498 


0.1 


Thyroid (female)


0.0


Renal ca. ACHN


0.0


Pancreatic ca. CAPAN2


0.0


Renal ca. UO-31


3.0


Pancreas Pool


0.0

Table SC. Panel 4.1D



Tissue Name


Rel.
Exp.(%)
Ag6034,
Run 


Tissue Name 


Rel. 

Exp.(%) 
Ag6034, 
Run 


Secondary Thl act 


0.0 


HUVEC IL-1beta


0.0 


Secondary Th2 act 


0.0 


HUVEC IFN gamma 


0.0 


Secondary Trl act 


0.4 


HUVEC TNF alpha + IFN gamma 


0.0 


Secondary Thl rest 


0.0 


HUVEC TNF alpha + IL4 


0.0 


Secondary Th2 rest


0.0


HUVEC IL-11


0.0 


Secondary Trl rest 


0.0 


Lung Microvascular EC none 


0.0 


Primary Th1 act




Lung Microvascular EC TNFalpha 
+ IL-lbeta 


0.0 


Primary Th2 act 


0.0 


Microvascular Dermal EC none 


0.0 


Primary Tr1 act


0.0


Microsvasular Dermal EC
TNFalpha + IL-1beta


0.0


Primary Thl rest 


0.0 


Bronchial epithelium TNFalpha + 
ILlbeta 


0.0 


Primary Th2 rest 


0.0 


Small airway epithelium none 


0.0 


Primary Trl rest 


0.0 


Small airway epithelium TNFalpha 
+ IL-1beta


0.0 


CD45RA CD4 lymphocyte act 


0.0 


Coronery artery SMC rest 


0.0 




n ft 


Coronery artery SMC TNFalpha + 
IL-lbeta 


0.0


CD8 lymphocyte act


0.0


Astrocytes rest


0.0


Secondary CD8 lymphocyte rest 


0.0 


Astrocytes TNFalpha + IL-lbeta 


0.0 


Secondary CD8 lymphocyte act 


0.0 


KU-812 (Basophil) rest 


0.0 | 


CD4 lymphocyte none 


0.0 


KU-812 (Basophil) 
PMA/ionomycin 


0.0 


2ry Thl/Th2/Trl_anti-CD95 

CH11


0.0 


CCD1 106 (Keratinocytes) none 


0.0 


LAK cells rest 


0.0 


CCD1106 (Keratinocytes)
TNFalpha + IL-lbeta 


0.0 


LAK cells IL-2 


0.0 


Liver cirrhosis 


17.0 


LAK cells IL-2+IL-12


0.0 


NCI-H292 none 


0.0 


LAK cells IL-2+IFN gamma 


0.0 


NCI-H292IL-4 


3.0 


LAK cells IL-2+ IL-18 


0.0 


NCI-H292 IL-9 


3.0 


LAK cells PMA/ionomycin 


0.0 


NCI-H292 IL-13 


3.0 


NK Cells IL-2 rest 


D.0 


NCI-H292 IFN gamma 


3.0 


Two Way MLR 3 day 


3.0 


HPAECnone ( 


3.0 


Two Way MLR 5 day 


3.0 


HPAEC TNF alpha + IL-1 beta ( 


3.0 | 


Two Way MLR 7 day 


3.0 ] 


Lung fibroblast none ( 


3.0 


PBMC rest 


,.0 


Lung fibroblast TNF alpha + IL-1 
>eta K 


).0 
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• 

X JDIVIU, XT W1VX 




- LungfibrQb eCT/-ysosL 

juung ijoroDiasi u-.-^ 


f\ urtM «ijJ» %i •« 

u.u 


PBMC PHA-L 


0.0 


Lung fibroblast EL-9 


0.0 


Ramos (B ceU) none 


o.o 


Lung fibroblast EL-13 


0.0 


Ramos (B cell) ionomycin 


0.0 


Lung fibroblast DFN gamma 


0.0 


B lymphocytes PWM 


0.0 


Dermal fibroblast CCD1070 rest 


0.0 


B lymphocytes CD40L and IL-4 


0.0 


Dermal fibroblast CCD1070 TNF 
alpha 


0.0 


EOL-1 dbcAMP 


o.o 


Dermal fibroblast CCD1070 IL-l 

Deia 


0.0 


PHT 1 AAtTD 
HyJLi-L uDCAJVLr 

PMA/ionomycin 


).0 


Dermal fibroblast IFN gamma 


0.0 


Dendritic cells none 


).0 


Dermal fibroblast IL-4 


0.0 


Dendritic cells LPS ( 


).0 


Derma] Fibroblasts rest 


0.3 


Dendritic cells anti-CD40 ( 


).5 


Neutrophils TNFa+LPS 


4.0 


Monocytes rest ( 


).0 


Neutrophils rest 


o.o 


Monocytes LPS ( 


).0 


Colon 


81.2 


Macrophages rest ( 


).0 


Lung 


4.7 


Macrophages LPS ( 


).0 


Thymus 


18.0 


HUVEC none ( 


).0 


Kidney 


100.0 


HUVEC starved ( 


).0 






Table SD. Panel 5 Islet

Tissue Name | Rel. Exp.(%) Ag6034, Run 256791126 | Tissue Name | Rel. Exp.(%) Ag6034, Run 256791126
97457_Patient-02go_adipose | 0.0 | 94709_Donor 2 AM - A_adipose | 0.0
97476_Patient-07sk_skeletal muscle | 0.0 | 94710_Donor 2 AM - B_adipose | 0.0
97477_Patient-07ut_uterus | 0.0 | 94711_Donor 2 AM - C_adipose | 0.0
97478_Patient-07pl_placenta | 0.0 | 94712_Donor 2 AD - A_adipose | 0.0
99167_Bayer Patient 1 | 0.0 | 94713_Donor 2 AD - B_adipose | 0.0
97482_Patient-08ut_uterus | 0.0 | 94714_Donor 2 AD - C_adipose | 0.0
97483_Patient-08pl_placenta | 0.0 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 0.0
97486_Patient-09sk_skeletal muscle | 0.0 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 0.0
97487_Patient-09ut_uterus | 0.0 | 94730_Donor 3 AM - A_adipose | 0.0
97488_Patient-09pl_placenta | 0.0 | 94731_Donor 3 AM - B_adipose | 0.0
97492_Patient-10ut_uterus | 0.0 | 94732_Donor 3 AM - C_adipose | 0.0
97493_Patient-10pl_placenta | 0.0 | 94733_Donor 3 AD - A_adipose | 0.0
97495_Patient-11go_adipose | 0.0 | 94734_Donor 3 AD - B_adipose | 0.0
97496_Patient-11sk_skeletal muscle | 0.0 | 94735_Donor 3 AD - C_adipose | 0.0
97497_Patient-11ut_uterus | 0.0 | 77138_Liver_HepG2untreated | 100.0
97498_Patient-11pl_placenta | 0.0 | 73556_Heart_Cardiac stromal cells (primary) | 0.0
97500_Patient-12go_adipose | 0.0 | 81735_Small Intestine | 25.5
97501_Patient-12sk_skeletal muscle | 0.0 | 72409_Kidney_Proximal Convoluted Tubule | 0.0
97502_Patient-12ut_uterus | 0.0 | 82685_Small intestine_Duodenum | 31.2
97503_Patient-12pl_placenta | 0.0 | 90650_Adrenal_Adrenocortical adenoma | 0.0
94721_Donor 2 U - A_Mesenchymal Stem Cells | 0.0 | 72410_Kidney_HRCE | 0.0
94722_Donor 2 U - B_Mesenchymal Stem Cells | 0.0 | 72411_Kidney_HRE | 0.0
94723_Donor 2 U - C_Mesenchymal Stem Cells | 0.0 | 73139_Uterus_Uterine smooth muscle cells | 0.0



CNS_neurodegeneration_v1.0 Summary: Ag6034 Expression of this gene is 
low/undetectable in all samples on this panel (CTs>35). (Data not shown.) 

General_screening_panel_v1.5 Summary: Ag6034 Highest expression of this 
gene is seen in colon cancer (CT=26.3). High to moderate levels of expression are also seen 
in colon, renal, liver and lung cancer cell lines, as well as in fetal lung. This expression 
suggests that this gene may be involved in these cancers. Thus, expression of this gene 
could be used to differentiate between these samples and other samples on this panel and as 
a marker of these cancers. Therapeutic modulation of the expression or function of this 
gene may also be useful in the treatment of these cancers. 

Panel 4.1D Summary: Ag6034 Expression of this gene is highest in colon and 
kidney (CTs=30). Thus, expression of this gene could be used as a marker of these tissues. 

Panel 5 Islet Summary: Ag6034 Highest expression of this gene is seen in a liver 
cell line (CT=30.6). Thus, expression of this gene could be used to differentiate between 
this sample and other samples on this panel. 

T. CG146513-01: Diacylglycerol acyltransferase 2. 

Expression of gene CG146513-01 was assessed using the primer-probe set Ag6036, 
described in Table TA. Results of the RTQ-PCR runs are shown in Table TB. 

Table TA. Probe Name Ag6036

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-tggaccctatggaagtatttcc-3' | 22 | 326 | 324
Probe | TET-5'-ttcccagtacagctggtgaagactca-3'-TAMRA | 26 | 356 | 325
Reverse | 5'-gttgtgtttgggagaaagatca-3' | 22 | 382 | 326



Table TB. Panel 5 Islet

Tissue Name | Rel. Exp.(%) Ag6036, Run 279370869 | Tissue Name | Rel. Exp.(%) Ag6036, Run 279370869
97457_Patient-02go_adipose | 10.5 | 94709_Donor 2 AM - A_adipose | 11.4
97476_Patient-07sk_skeletal muscle | 0.0 | 94710_Donor 2 AM - B_adipose | 6.7
97477_Patient-07ut_uterus | 3.3 | 94711_Donor 2 AM - C_adipose | 4.2
97478_Patient-07pl_placenta | 0.0 | 94712_Donor 2 AD - A_adipose | 23.8
99167_Bayer Patient 1 | 3.3 | 94713_Donor 2 AD - B_adipose | 32.8
97482_Patient-08ut_uterus | 2.0 | 94714_Donor 2 AD - C_adipose | 22.2
97483_Patient-08pl_placenta | 1.0 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 2.6
97486_Patient-09sk_skeletal muscle | 8.4 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 2.5
97487_Patient-09ut_uterus | 5.8 | 94730_Donor 3 AM - A_adipose | 12.9
97488_Patient-09pl_placenta | 2.2 | 94731_Donor 3 AM - B_adipose | 21.0
97492_Patient-10ut_uterus |  | 94732_Donor 3 AM - C_adipose | 20.4
97493_Patient-10pl_placenta | 3.2 | 94733_Donor 3 AD - A_adipose | 26.4
97495_Patient-11go_adipose | 6.0 | 94734_Donor 3 AD - B_adipose | 25.5
97496_Patient-11sk_skeletal muscle | 20.2 | 94735_Donor 3 AD - C_adipose | 6.5
97497_Patient-11ut_uterus | 8.7 | 77138_Liver_HepG2untreated | 41.5
97498_Patient-11pl_placenta | 1.9 | 73556_Heart_Cardiac stromal cells (primary) | 1.6
97500_Patient-12go_adipose | 4.0 | 81735_Small Intestine | 10.7
97501_Patient-12sk_skeletal muscle | 22.2 | 72409_Kidney_Proximal Convoluted Tubule | 100.0
97502_Patient-12ut_uterus | 7.1 | 82685_Small intestine_Duodenum | 15.7
97503_Patient-12pl_placenta | 1.3 | 90650_Adrenal_Adrenocortical adenoma | 5.0
94721_Donor 2 U - A_Mesenchymal Stem Cells | 12.8 | 72410_Kidney_HRCE | 31.2
94722_Donor 2 U - B_Mesenchymal Stem Cells | 6.8 | 72411_Kidney_HRE | 9.1
94723_Donor 2 U - C_Mesenchymal Stem Cells | 11.2 | 73139_Uterus_Uterine smooth muscle cells | 13.3





CNS_neurodegeneration_v1.0 Summary: Ag6036 Expression of this gene is 
low/undetectable in all samples on this panel (CTs>35). 

General_screening_panel_v1.5 Summary: Ag6036 Expression of this gene is 
low/undetectable in all samples on this panel (CTs>35). 

Panel 4.1D Summary: Ag6036 Expression of this gene is low/undetectable in all 
samples on this panel (CTs>35). 

Panel 5 Islet Summary: Ag6036 Highest expression of this gene is seen in a 
kidney derived sample (CT=29.5). Moderate levels of expression are seen in many samples 
on this panel, including samples from uterus, placenta, adipose, and skeletal muscle. Thus, 
this gene may be involved in diseases of these tissues, including obesity and diabetes. 
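
The relationship between the CT values quoted in these summaries and the Rel. Exp.(%) columns of the tables can be illustrated with a minimal sketch. It assumes the conventional RTQ-PCR normalization, in which the sample with the lowest CT on a panel is set to 100% and each additional cycle halves the inferred amount of template (perfect doubling per cycle). The function name and all CT values other than 29.5 are hypothetical and are not taken from the tables.

```python
def relative_expression(ct_by_sample):
    """Return percent expression relative to the lowest-CT sample.

    Assumes perfect doubling per cycle, so each extra cycle halves the
    inferred starting amount of template.
    """
    ct_min = min(ct_by_sample.values())
    return {
        sample: 100.0 * 2.0 ** (ct_min - ct)
        for sample, ct in ct_by_sample.items()
    }


# Hypothetical CT values; only 29.5 echoes a figure quoted in the text.
panel = {"kidney-derived sample": 29.5, "sample A": 30.5, "sample B": 33.5}
rel_exp = relative_expression(panel)
for sample, percent in rel_exp.items():
    print(f"{sample}: {percent:.2f}%")
```

Under these assumptions a sample one cycle behind the highest expresser reads as half its level, which is why small CT differences translate into large differences in the percentage columns.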

U. CG146522-01: Diacylglycerol acyltransferase 2. 

Expression of gene CG146522-01 was assessed using the primer-probe set Ag6037, 
described in Table UA. Results of the RTQ-PCR runs are shown in Table UB. 

Table UA. Probe Name Ag6037

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-attccaagcagcctagtcactt-3' | 22 | 49 | 327
Probe | TET-5'-ttctgcagtggcctttgagctacctt-3'-TAMRA | 26 | 85 | 328
Reverse | 5'-cagcaggtagacgaacaagatg-3' | 22 | 113 | 329

Table UB. Panel 5 Islet

Tissue Name | Rel. Exp.(%) Ag6037, Run 279370870 | Tissue Name | Rel. Exp.(%) Ag6037, Run 279370870
97457_Patient-02go_adipose | 0.0 | 94709_Donor 2 AM - A_adipose | 0.0
97476_Patient-07sk_skeletal muscle | 0.0 | 94710_Donor 2 AM - B_adipose | 0.0
97477_Patient-07ut_uterus | 0.0 | 94711_Donor 2 AM - C_adipose | 0.0
97478_Patient-07pl_placenta | 0.0 | 94712_Donor 2 AD - A_adipose | 0.0
99167_Bayer Patient 1 | 0.9 | 94713_Donor 2 AD - B_adipose | 6.0
97482_Patient-08ut_uterus | 0.8 | 94714_Donor 2 AD - C_adipose | 0.0
97483_Patient-08pl_placenta | 0.0 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 0.0
97486_Patient-09sk_skeletal muscle | 9.0 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 0.0
97487_Patient-09ut_uterus | 2.2 | 94730_Donor 3 AM - A_adipose | 0.0
97488_Patient-09pl_placenta | 0.0 | 94731_Donor 3 AM - B_adipose | 0.0
97492_Patient-10ut_uterus | 0.5 | 94732_Donor 3 AM - C_adipose | 0.0
97493_Patient-10pl_placenta | 3.5 | 94733_Donor 3 AD - A_adipose | 0.0
97495_Patient-11go_adipose | 1.2 | 94734_Donor 3 AD - B_adipose | 0.9
97496_Patient-11sk_skeletal muscle |  | 94735_Donor 3 AD - C_adipose | 0.0
97497_Patient-11ut_uterus | 0.0 | 77138_Liver_HepG2untreated | 0.0
97498_Patient-11pl_placenta | 0.0 | 73556_Heart_Cardiac stromal cells (primary) | 0.0
97500_Patient-12go_adipose | 1.7 | 81735_Small Intestine | 1.0
97501_Patient-12sk_skeletal muscle | 100.0 | 72409_Kidney_Proximal Convoluted Tubule | 0.0
97502_Patient-12ut_uterus | 0.0 | 82685_Small intestine_Duodenum | 0.0
97503_Patient-12pl_placenta | 1.0 | 90650_Adrenal_Adrenocortical adenoma | 0.0
94721_Donor 2 U - A_Mesenchymal Stem Cells | 0.0 | 72410_Kidney_HRCE | 5.0
94722_Donor 2 U - B_Mesenchymal Stem Cells | 3.0 | 72411_Kidney_HRE | 0.0
94723_Donor 2 U - C_Mesenchymal Stem Cells | 3.5 | 73139_Uterus_Uterine smooth muscle cells | 0.0



CNS_neurodegeneration_v1.0 Summary: Ag6037 Expression of this gene is 
low/undetectable in all samples on this panel (CTs>35). 

General_screening_panel_v1.5 Summary: Ag6037 Expression of this gene is 
low/undetectable in all samples on this panel (CTs>35). 

Panel 4.1D Summary: Ag6037 Expression of this gene is low/undetectable in all 
samples on this panel (CTs>35). 

Panel 5 Islet Summary: Ag6037 Expression of this gene is limited to skeletal 
muscle (CTs=30-31). Thus, expression of this gene could be used to differentiate these 
samples from other samples on this panel and as a marker of this tissue. Furthermore, 
therapeutic modulation of the expression or function of this gene may be useful in the 
treatment of metabolic disorders, including obesity and diabetes. 

V. CG146531-01: DIACYLGLYCEROL ACYLTRANSFERASE 2. 



Expression of gene CG146531-01 was assessed using the primer-probe set Ag6038, 
described in Table VA. 

Table VA. Probe Name Ag6038

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-aaggtgtcacaggaagagcat-3' | 21 | 10 | 330
Probe | TET-5'-agccaggtcaccatggctttcttct-3'-TAMRA | 25 | 49 | 331
Reverse | 5'-gccctcctggagattcagt-3' | 19 | 78 | 332
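
The Length column of a primer-probe table such as Table VA can be cross-checked against the sequences themselves. The following is a minimal sketch of that check using the Ag6038 entries; the helper name and the data layout are illustrative assumptions, and the 5'/3' labels and the TET/TAMRA dye names are left out of the sequence strings.

```python
# Ag6038 entries transcribed from Table VA: (primer, sequence, stated length).
table_va = [
    ("Forward", "aaggtgtcacaggaagagcat", 21),
    ("Probe", "agccaggtcaccatggctttcttct", 25),
    ("Reverse", "gccctcctggagattcagt", 19),
]


def length_mismatches(rows):
    """Return the names of primers whose sequence length differs from the stated length."""
    return [name for name, sequence, stated in rows if len(sequence) != stated]


mismatches = length_mismatches(table_va)
print(mismatches if mismatches else "all stated lengths match")
```

The same check can be applied to the other primer-probe tables in this section by substituting their rows.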



CNS_neurodegeneration_v1.0 Summary: Ag6038 Expression of this gene is 
low/undetectable in all samples on this panel (CTs>35). 

General_screening_panel_v1.5 Summary: Ag6038 Expression of this gene is 
low/undetectable in all samples on this panel (CTs>35). 

Panel 4.1D Summary: Ag6038 Expression of this gene is low/undetectable in all 
samples on this panel (CTs>35). 

Panel 5 Islet Summary: Ag6038 Expression of this gene is low/undetectable in all 
samples on this panel (CTs>35). 

W. CG147274-01: Protease. 
Expression of gene CG147274-01 was assessed using the primer-probe set Ag5623, 
described in Table WA. 

Table WA. Probe Name Ag5623

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-gatgtgctgccttcagaatg-3' | 20 | 64 | 333
Probe | TET-5'-aatcctcccggcctccttggagt-3'-TAMRA | 23 | 89 | 334
Reverse | 5'-gtccttcctgggtgtcttg-3' | 19 | 121 | 335



CNS_neurodegeneration_v1.0 Summary: Ag5623 Expression of this gene is 
low/undetectable in all samples on this panel (CTs>35). 

General_screening_panel_v1.5 Summary: Ag5623 Expression of this gene is 
low/undetectable in all samples on this panel (CTs>35). 

Panel 4.1D Summary: Ag5623 Expression of this gene is low/undetectable in all 
samples on this panel (CTs>35). 
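
The "low/undetectable" call used throughout these summaries amounts to every sample on a panel having a CT above 35. A minimal sketch of that rule follows; the cutoff of 35 is the figure quoted in the summaries, while the function name and the CT readings are hypothetical.

```python
CT_CUTOFF = 35.0  # cutoff quoted in the summaries; treated here as a fixed constant


def low_or_undetectable(ct_values, cutoff=CT_CUTOFF):
    """True when every sample on the panel has a CT above the cutoff."""
    return all(ct > cutoff for ct in ct_values)


# Hypothetical CT readings, not taken from the tables.
silent_panel = [36.2, 38.0, 40.0]
expressed_panel = [29.5, 36.1, 40.0]
print(low_or_undetectable(silent_panel))
print(low_or_undetectable(expressed_panel))
```

A single sample at or below the cutoff is enough for the panel to be reported with a highest-expressing tissue rather than as low/undetectable.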

X. CG147419-01: GLUTAMINE: FRUCTOSE-6-PHOSPHATE 
AMIDOTRANSFERASE 1 MUSCLE. 

Expression of gene CG147419-01 was assessed using the primer-probe set Ag5207, 
described in Table XA. Results of the RTQ-PCR runs are shown in Tables XB, XC, XD 
and XE. 

Table XA. Probe Name Ag5207

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-gccctctgttgattggtgta-3' | 20 | 736 | 336
Probe | TET-5'-cggagtgaacataaactttctactgatca-3'-TAMRA | 29 | 756 | 337
Reverse | 5'-ccaatctgagtcctagctgttc-3' | 22 | 802 | 338

Table XB. CNS neurodegeneration v1.0

Tissue Name | Rel. Exp.(%) Ag5207, Run 226559656 | Tissue Name | Rel. Exp.(%) Ag5207, Run 226559656
AD 1 Hippo | 11.3 | Control (Path) 3 Temporal Ctx | 2.3
AD 2 Hippo | 14.6 | Control (Path) 4 Temporal Ctx | 54.7
AD 3 Hippo | 0.0 | AD 1 Occipital Ctx | 1.8
AD 4 Hippo | 6.3 | AD 2 Occipital Ctx (Missing) | 0.0
AD 5 Hippo | 100.0 | AD 3 Occipital Ctx | 1.7
AD 6 Hippo | 29.3 | AD 4 Occipital Ctx | 11.5
Control 2 Hippo | 59.0 | AD 5 Occipital Ctx | 21.0
Control 4 Hippo | 0.0 | AD 6 Occipital Ctx | 97.9
Control (Path) 3 Hippo | 1.8 | Control 1 Occipital Ctx | 0.0
AD 1 Temporal Ctx | 12.5 | Control 2 Occipital Ctx | 100.0
AD 2 Temporal Ctx | 41.5 | Control 3 Occipital Ctx | 13.3
AD 3 Temporal Ctx | 2.2 | Control 4 Occipital Ctx | 2.2
AD 4 Temporal Ctx | 24.1 | Control (Path) 1 Occipital Ctx | 100.0
AD 5 Inf Temporal Ctx | 65.5 | Control (Path) 2 Occipital Ctx | 7.2
AD 5 Sup Temporal Ctx | 29.1 | Control (Path) 3 Occipital Ctx | 0.0
AD 6 Inf Temporal Ctx | 26.2 | Control (Path) 4 Occipital Ctx | 18.9
AD 6 Sup Temporal Ctx | 49.3 | Control 1 Parietal Ctx | 2.5
Control 1 Temporal Ctx | 0.0 | Control 2 Parietal Ctx | 53.2
Control 2 Temporal Ctx | 88.3 | Control 3 Parietal Ctx | 21.6
Control 3 Temporal Ctx | 19.5 | Control (Path) 1 Parietal Ctx | 94.6
Control 4 Temporal Ctx | 4.9 | Control (Path) 2 Parietal Ctx | 16.8
Control (Path) 1 Temporal Ctx | 97.3 | Control (Path) 3 Parietal Ctx | 4.0
Control (Path) 2 Temporal Ctx | 48.0 | Control (Path) 4 Parietal Ctx | 50.3



Table XC. General screening panel v1.5

Tissue Name | Rel. Exp.(%) Ag5207, Run 228757767 | Tissue Name | Rel. Exp.(%) Ag5207, Run 228757767
Adipose | 9.9 | Renal ca. TK-10 | 2.9
Melanoma* Hs688(A).T | 4.0 | Bladder | 2.2
Melanoma* Hs688(B).T | 12.1 | Gastric ca. (liver met) NCI-N87 | 23.2
Melanoma* M14 | 4.1 | Gastric ca. KATO III | 17.4
Melanoma* LOXMVI | 0.7 | Colon ca. SW-948 | 0.4
Melanoma* SK-MEL-5 | 1.8 | Colon ca. SW480 | 
Squamous cell carcinoma SCC-4 | 0.7 | Colon ca.* (SW480 met) SW620 | 0.1
Testis Pool | 2.8 | Colon ca. HT29 | 0.3
Prostate ca.* (bone met) PC-3 | 6.3 | Colon ca. HCT-116 | 0.2
Prostate Pool | 4.1 | Colon ca. CaCo-2 | 1.6
Placenta | 0.2 | Colon cancer tissue | 1.3
Uterus Pool | 5.6 | Colon ca. SW1116 | 0.0
Ovarian ca. OVCAR-3 | 0.2 | Colon ca. Colo-205 | 2.6
Ovarian ca. SK-OV-3 | 5.9 | Colon ca. SW-48 | 0.8
Ovarian ca. OVCAR-4 | 1.2 | Colon Pool | 11.3
Ovarian ca. OVCAR-5 | 1.6 | Small Intestine Pool | 4.2
Ovarian ca. IGROV-1 | 0.8 | Stomach Pool | 2.9
Ovarian ca. OVCAR-8 | 1.7 | Bone Marrow Pool | 2.1
Ovary | 0.7 | Fetal Heart | 45.7
Breast ca. MCF-7 | 0.3 | Heart Pool | 38.2
Breast ca. MDA-MB-231 | 3.8 | Lymph Node Pool | 11.3
Breast ca. BT 549 | 1.3 | Fetal Skeletal Muscle | 19.3
Breast ca. T47D | 0.0 | Skeletal Muscle Pool | 100.0
Breast ca. MDA-N | 0.2 | Spleen Pool | 0.5
Breast Pool | 6.4 | Thymus Pool | 4.0
Trachea | 1.0 | CNS cancer (glio/astro) U87-MG | 11.0
Lung | 1.5 | CNS cancer (glio/astro) U-118-MG | 24.0
Fetal Lung | 1.2 | CNS cancer (neuro;met) SK-N-AS | 3.4
Lung ca. NCI-N417 | 0.7 | CNS cancer (astro) SF-539 | 1.0
Lung ca. LX-1 | 0.6 | CNS cancer (astro) SNB-75 | 1.4
Lung ca. NCI-H146 | 0.5 | CNS cancer (glio) SNB-19 | 1.2
Lung ca. SHP-77 | 0.4 | CNS cancer (glio) SF-295 | 18.6
Lung ca. A549 | 4.8 | Brain (Amygdala) Pool | 3.7
Lung ca. NCI-H526 | 0.6 | Brain (cerebellum) | 4.6
Lung ca. NCI-H23 | 0.2 | Brain (fetal) | 0.2
Lung ca. NCI-H460 | 3.2 | Brain (Hippocampus) Pool | 3.1
Lung ca. HOP-62 | 4.3 | Cerebral Cortex Pool | 6.7
Lung ca. NCI-H522 | 2.0 | Brain (Substantia nigra) Pool | 13
Liver | 0.1 | Brain (Thalamus) Pool | 3.2
Fetal Liver | 14 | Brain (whole) | 
Liver ca. HepG2 | 3.4 | Spinal Cord Pool | 1.2
Kidney Pool | 14.4 | Adrenal Gland | 16
Fetal Kidney | 0.2 | Pituitary gland Pool | 1.5
Renal ca. 786-0 | 1.2 | Salivary Gland | 0.4
Renal ca. A498 | 12 | Thyroid (female) | 0.4
Renal ca. ACHN | 1.6 | Pancreatic ca. CAPAN2 | 1.8
Renal ca. UO-31 |  | Pancreas Pool | 

Table XD. Panel 4.1D

Tissue Name | Rel. Exp.(%) Ag5207, Run 229739304 | Tissue Name | Rel. Exp.(%) Ag5207, Run 229739304
Secondary Th1 act | 0.0 | HUVEC IL-1beta | 16.0
Secondary Th2 act |  | HUVEC IFN gamma | 
Secondary Tr1 act | 0.0 | HUVEC TNF alpha + IFN gamma | 3.5
Secondary Th1 rest | 0.0 | HUVEC TNF alpha + IL4 | 0.0
Secondary Th2 rest | 0.0 | HUVEC IL-11 | 5.5
Secondary Tr1 rest | 0.0 | Lung Microvascular EC none | 7.1
Primary Th1 act | 0.0 | Lung Microvascular EC TNFalpha + IL-1beta | 0.0
Primary Th2 act | 5.6 | Microvascular Dermal EC none | 0.0
Primary Tr1 act | 5.6 | Microvascular Dermal EC TNFalpha + IL-1beta | 0.0
Primary Th1 rest | 0.0 | Bronchial epithelium TNFalpha + IL1beta | 0.0
Primary Th2 rest | 0.0 | Small airway epithelium none | 5.8
Primary Tr1 rest | 0.0 | Small airway epithelium TNFalpha + IL-1beta | 3.6
CD45RA CD4 lymphocyte act | 35.6 | Coronary artery SMC rest | 7.4
CD45RO CD4 lymphocyte act | 0.0 | Coronary artery SMC TNFalpha + IL-1beta | 13.6
CD8 lymphocyte act | 7.8 | Astrocytes rest | 0.0
Secondary CD8 lymphocyte rest | 0.0 | Astrocytes TNFalpha + IL-1beta | 12.9
Secondary CD8 lymphocyte act | 0.0 | KU-812 (Basophil) rest | 0.0
CD4 lymphocyte none | 10.7 | KU-812 (Basophil) PMA/ionomycin | 0.0
2ry Th1/Th2/Tr1_anti-CD95 CH11 | 0.0 | CCD1106 (Keratinocytes) none | 18.6
LAK cells rest | 0.0 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 0.0
LAK cells IL-2 | 0.0 | Liver cirrhosis | 0.0
LAK cells IL-2+IL-12 | 0.0 | NCI-H292 none | 0.0
LAK cells IL-2+IFN gamma | 0.0 | NCI-H292 IL-4 | 0.0
LAK cells IL-2+ IL-18 | 0.0 | NCI-H292 IL-9 | 0.0
LAK cells PMA/ionomycin | 6.0 | NCI-H292 IL-13 | 0.0
NK Cells IL-2 rest | 4.8 | NCI-H292 IFN gamma | 0.0
Two Way MLR 3 day | 0.0 | HPAEC none | 4.7
Two Way MLR 5 day | 0.0 | HPAEC TNF alpha + IL-1 beta | 20.9
Two Way MLR 7 day | 0.0 | Lung fibroblast none | 17.7
PBMC rest | 7.0 | Lung fibroblast TNF alpha + IL-1 beta | 23.0
PBMC PWM | 0.0 | Lung fibroblast IL-4 | 
PBMC PHA-L | 0.0 | Lung fibroblast IL-9 | 11.3
Ramos (B cell) none | 0.0 | Lung fibroblast IL-13 | 9.2
Ramos (B cell) ionomycin | 0.0 | Lung fibroblast IFN gamma | 33.0
B lymphocytes PWM | 0.0 | Dermal fibroblast CCD1070 rest | 41.8
B lymphocytes CD40L and IL-4 | 0.0 | Dermal fibroblast CCD1070 TNF alpha | 100.0
EOL-1 dbcAMP | 0.0 | Dermal fibroblast CCD1070 IL-1 beta | 77.9
EOL-1 dbcAMP PMA/ionomycin | 0.0 | Dermal fibroblast IFN gamma | 7.6
Dendritic cells none | 5.1 | Dermal fibroblast IL-4 | 15.3
Dendritic cells LPS | 0.0 | Dermal Fibroblasts rest | 34.6
Dendritic cells anti-CD40 | 0.0 | Neutrophils TNFa+LPS | 4.8
Monocytes rest | 6.6 | Neutrophils rest | 0.0
Monocytes LPS | 0.0 | Colon | 0.0
Macrophages rest | 0.0 | Lung | 12.3
Macrophages LPS | 0.0 | Thymus | 0.0
HUVEC none |  | Kidney | 3.0
HUVEC starved | 2.5 |  | 







Table XE. Panel 5 Islet

Tissue Name | Rel. Exp.(%) Ag5207, Run 263594763 | Tissue Name | Rel. Exp.(%) Ag5207, Run 263594763
97457_Patient-02go_adipose | 2.0 | 94709_Donor 2 AM - A_adipose | 4.6
97476_Patient-07sk_skeletal muscle | 3.1 | 94710_Donor 2 AM - B_adipose | 1.1
97477_Patient-07ut_uterus | 3.2 | 94711_Donor 2 AM - C_adipose | 0.8
97478_Patient-07pl_placenta | 2.0 | 94712_Donor 2 AD - A_adipose | 1.0
99167_Bayer Patient 1 | 1.0 | 94713_Donor 2 AD - B_adipose | 8.1
97482_Patient-08ut_uterus | 6.7 | 94714_Donor 2 AD - C_adipose | 5.3
97483_Patient-08pl_placenta | 0.0 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 1.2
97486_Patient-09sk_skeletal muscle | 27.4 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 3.7
97487_Patient-09ut_uterus | 12.4 | 94730_Donor 3 AM - A_adipose | 4.6
97488_Patient-09pl_placenta | 1.3 | 94731_Donor 3 AM - B_adipose | 2.1
97492_Patient-10ut_uterus | 14.4 | 94732_Donor 3 AM - C_adipose | 1.0
97493_Patient-10pl_placenta | 1.1 | 94733_Donor 3 AD - A_adipose | 5.9
97495_Patient-11go_adipose | 1.0 | 94734_Donor 3 AD - B_adipose | 5.2
97496_Patient-11sk_skeletal muscle | 50.3 | 94735_Donor 3 AD - C_adipose | 4.4
97497_Patient-11ut_uterus | 7.1 | 77138_Liver_HepG2untreated | 
97498_Patient-11pl_placenta | 0.0 | 73556_Heart_Cardiac stromal cells (primary) | 2.2
97500_Patient-12go_adipose | 10.7 | 81735_Small Intestine | 7.1
97501_Patient-12sk_skeletal muscle | 100.0 | 72409_Kidney_Proximal Convoluted Tubule | 0.0
97502_Patient-12ut_uterus | 10.9 | 82685_Small intestine_Duodenum | 0.0
97503_Patient-12pl_placenta | 0.0 | 90650_Adrenal_Adrenocortical adenoma | 
94721_Donor 2 U - A_Mesenchymal Stem Cells | 1.8 | 72410_Kidney_HRCE | 4.9
94722_Donor 2 U - B_Mesenchymal Stem Cells | 1.0 | 72411_Kidney_HRE | 0.0
94723_Donor 2 U - C_Mesenchymal Stem Cells | 3.5 | 73139_Uterus_Uterine smooth muscle cells | 



CNS_neurodegeneration_v1.0 Summary: Ag5207 This panel does not show 
differential expression of this gene in Alzheimer's disease. However, this profile confirms 
the expression of this gene at moderate levels in the brain. Please see Panel 1.5 for 
discussion of this gene in the central nervous system. 

General_screening_panel_v1.5 Summary: Ag5207 Highest expression of this 
gene is seen in skeletal muscle (CT=28). Low but significant expression is also seen in 
pancreas, adrenal, pituitary, adipose, adult and fetal heart, and fetal skeletal muscle. This 
gene encodes a protein that is homologous to glutamine:fructose-6-phosphate 
amidotransferase (GFAT), which catalyzes the formation of glucosamine 6-phosphate and is 
the first and rate-limiting enzyme of the hexosamine biosynthetic pathway. Enhanced 
glucose flux via the hexosamine biosynthetic pathway has been implicated in the 
induction of insulin resistance. Buse et al. showed in a mouse model that glucose flux via 
the hexosamine pathway is selectively increased in muscle and may contribute to muscle 
insulin resistance in non-insulin-dependent diabetes mellitus (Am J Physiol 1997 
Jun;272(6 Pt 1):E1080-8). Thus, based on the homology of this enzyme to GFAT and the 
high expression in muscle, modulation of the expression or function of this gene may be 
useful in the treatment of type II diabetes. 

This gene is widely expressed on this panel with moderate to low expression seen 
throughout the CNS, including the hippocampus, thalamus, substantia nigra, amygdala, 
cerebellum and cerebral cortex. Therefore, therapeutic modulation of the expression or 
function of this gene may be useful in the treatment of neurological disorders, such as 
Alzheimer's disease, Parkinson's disease, schizophrenia, multiple sclerosis, stroke and 
epilepsy. 

Moderate to low levels of expression are also seen in many cancer cell lines on this 
panel, including gastric cancer and melanoma cell lines. Thus, modulation of this gene 
product may be useful in the treatment of cancer. 

Panel 4.1D Summary: Ag5207 Detectable levels of expression appear to be 
restricted to TNF-alpha treated dermal fibroblasts (CT=33.3). This expression suggests that 
this gene product may be involved in skin disorders, including psoriasis. 

Panel 5 Islet Summary: Ag5207 Highest expression is seen in skeletal muscle 
(CT=30.2), in agreement with Panel 1.5. Moderate to low levels of expression are also seen 
in other metabolic tissues, including uterus and adipose. Please see Panel 1.5 for discussion 
of this gene in metabolic disease. 

Y. CG148102-01: CARNITINE 
O-PALMITOYLTRANSFERASE I. 

Expression of gene CG148102-01 was assessed using the primer-probe set Ag5274, 
described in Table YA. Results of the RTQ-PCR runs are shown in Tables YB, YC, YD 
and YE. 

Table YA. Probe Name Ag5274

Primers | Sequence | Length | Start Position | SEQ ID No
Forward | 5'-cacttccgggacccacagt-3' | 19 | 1732 | 339
Probe | TET-5'-caccaggctctgctgaaggcagcc-3'-TAMRA | 24 | 1783 | 340
Reverse | 5'-caaacaggtggcggtcaact-3' | 20 | 1821 | 341



Table YB. CNS neurodegeneration v1.0

Tissue Name | Rel. Exp.(%) Ag5274, Run 230512893 | Tissue Name | Rel. Exp.(%) Ag5274, Run 230512893
AD 1 Hippo | 19.3 | Control (Path) 3 Temporal Ctx | 
AD 2 Hippo | 33.2 | Control (Path) 4 Temporal Ctx | 29.7
AD 3 Hippo | 11.7 | AD 1 Occipital Ctx | 18.3
AD 4 Hippo | 9.9 | AD 2 Occipital Ctx (Missing) | 0.0
AD 5 Hippo | 95.9 | AD 3 Occipital Ctx | 7.5
AD 6 Hippo | 43.5 | AD 4 Occipital Ctx | 15.1
Control 2 Hippo | 57.0 | AD 5 Occipital Ctx | 66.4
Control 4 Hippo | 11.9 | AD 6 Occipital Ctx | 13.1
Control (Path) 3 Hippo | 8.5 | Control 1 Occipital Ctx | 3.7
AD 1 Temporal Ctx | 17.0 | Control 2 Occipital Ctx | 98.6
AD 2 Temporal Ctx | 29.5 | Control 3 Occipital Ctx | 27.5
AD 3 Temporal Ctx | 8.3 | Control 4 Occipital Ctx | 4.5
AD 4 Temporal Ctx | 19.6 | Control (Path) 1 Occipital Ctx | 100.0
AD 5 Inf Temporal Ctx | 95.9 | Control (Path) 2 Occipital Ctx | 17.1
AD 5 Sup Temporal Ctx | 53.6 | Control (Path) 3 Occipital Ctx | 3.8
AD 6 Inf Temporal Ctx | 29.9 | Control (Path) 4 Occipital Ctx | 20.0
AD 6 Sup Temporal Ctx | 33.2 | Control 1 Parietal Ctx | 10.5
Control 1 Temporal Ctx | 8.4 | Control 2 Parietal Ctx | 49.3
Control 2 Temporal Ctx | 70.2 | Control 3 Parietal Ctx | 19.2
Control 3 Temporal Ctx | 25.0 | Control (Path) 1 Parietal Ctx | 94.6
Control 4 Temporal Ctx | 11.3 | Control (Path) 2 Parietal Ctx | 25.0
Control (Path) 1 Temporal Ctx | 74.2 | Control (Path) 3 Parietal Ctx | 6.0
Control (Path) 2 Temporal Ctx | 44.4 | Control (Path) 4 Parietal Ctx | 50.7



Table YC. General screening panel v1.5

Tissue Name | Rel. Exp.(%) Ag5274, Run 230762793 | Tissue Name | Rel. Exp.(%) Ag5274, Run 230762793
Adipose | 1.2 | Renal ca. TK-10 | 0.0
Melanoma* Hs688(A).T | 7.4 | Bladder | 1.7
Melanoma* Hs688(B).T | 13.0 | Gastric ca. (liver met.) NCI-N87 | 1.0
Melanoma* M14 | 0.1 | Gastric ca. KATO III | 0.2
Melanoma* LOXMVI | 0.0 | Colon ca. SW-948 | 1.4
Melanoma* SK-MEL-5 | 0.0 | Colon ca. SW480 | 0.7
Squamous cell carcinoma SCC-4 | 1.5 | Colon ca.* (SW480 met) SW620 | 0.0
Testis Pool | 2.1 | Colon ca. HT29 | 0.2
Prostate ca.* (bone met) PC-3 | 21.8 | Colon ca. HCT-116 | 2.1
Prostate Pool | 0.8 | Colon ca. CaCo-2 | 0.3
Placenta | 0.7 | Colon cancer tissue | 2.4
Uterus Pool | 0.7 | Colon ca. SW1116 | 0.0
Ovarian ca. OVCAR-3 | 12.2 | Colon ca. Colo-205 | 
Ovarian ca. SK-OV-3 | 0.2 | Colon ca. SW-48 | 0.0
Ovarian ca. OVCAR-4 | 0.1 | Colon Pool | 3.5
Ovarian ca. OVCAR-5 | 2.8 | Small Intestine Pool | 2.1
Ovarian ca. IGROV-1 | 7.2 | Stomach Pool | 1.8
Ovarian ca. OVCAR-8 | 3.9 | Bone Marrow Pool | 0.8
Ovary | 6.3 | Fetal Heart | 1.7
Breast ca. MCF-7 | 0.2 | Heart Pool | 1.5
Breast ca. MDA-MB-231 | 4.9 | Lymph Node Pool | 5.3
Breast ca. BT 549 | 88.3 | Fetal Skeletal Muscle | 1.0
Breast ca. T47D | 0.0 | Skeletal Muscle Pool | 0.8
Breast ca. MDA-N | 0.0 | Spleen Pool | 3.0
Breast Pool | 4.9 | Thymus Pool | 2.7
Trachea | 1.0 | CNS cancer (glio/astro) U87-MG | 27.7
Lung | 0.9 | CNS cancer (glio/astro) U-118-MG | 27.4
Fetal Lung | 7.2 | CNS cancer (neuro;met) SK-N-AS | 86.5
Lung ca. NCI-N417 | 8.2 | CNS cancer (astro) SF-539 | 0.0
Lung ca. LX-1 | 0.5 | CNS cancer (astro) SNB-75 | 0.5
Lung ca. NCI-H146 | 16.2 | CNS cancer (glio) SNB-19 | 7.2
Lung ca. SHP-77 | 53.6 | CNS cancer (glio) SF-295 | 17.3
Lung ca. A549 | 0.0 | Brain (Amygdala) Pool | 19.9
Lung ca. NCI-H526 | 3.6 | Brain (cerebellum) | 100.0
Lung ca. NCI-H23 | 40.9 | Brain (fetal) | 44.8
Lung ca. NCI-H460 | 0.6 | Brain (Hippocampus) Pool | 16.8
Lung ca. HOP-62 | 1.6 | Cerebral Cortex Pool | 24.0
Lung ca. NCI-H522 | 57.8 | Brain (Substantia nigra) Pool | 27.4
Liver | 0.3 | Brain (Thalamus) Pool | 34.2
Fetal Liver | 0.9 | Brain (whole) | 42.0
Liver ca. HepG2 | 0.0 | Spinal Cord Pool | 10.5
Kidney Pool | 4.2 | Adrenal Gland | 1.0
Fetal Kidney | 3.6 | Pituitary gland Pool | 4.9
Renal ca. 786-0 | 0.0 | Salivary Gland | 0.1
Renal ca. A498 | 0.0 | Thyroid (female) | 0.6
Renal ca. ACHN | 0.5 | Pancreatic ca. CAPAN2 | 0.0
Renal ca. UO-31 | 0.3 | Pancreas Pool | 4.8



Table YD. Panel 4.1D 





Rel. 




Rel. 




Ep.(%) 




Exp.(%) 


Tissue Name 


Ag5274, 


Tissue Name 


Ag5274, 




Run 




Run 




230472159 




230472159 
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Secondary Thl act [2.3 |hUVEC IL-fbfe 1 /U b 0 C Itft* Ji ^~ 


Secondary Th2 act Jl.6 "jHUVEC JFN eamma iQ?n 


Secondary Trl act 


jO.O |HUVEC TNF alpha + IFN gamma ll5.1 1 


Secondary Thl rest 


jO.O JHUVEC TNF alpha + IL4 


11.7 j 


Secondary Th2rest 


|2.3 jrlUVECIL-ll 


67.8 J 


Secondary Trl rest 


0.0 jLung Microvascular EC none 


jB.Z [ 


Primary Th1 act


L Q jLung Microvascular EC TNFalpha 
j ' |+ IL-lbeta 


9.2 j 


Primary Th2 act 


JO.O 


Microvascular Dermal EC none 


ZO.Z 


Primary Trl act 


0.0 


Microvascular Dermal EC
TNFalpha + IL-lbeta 


9.0 


Primary Th1 rest


0.0 

* 


Bronchial epithelium TNFalpha + 
ILlbeta 


0.0 


Primary Th2 rest 


4.6 


Small airway epithelium none 


o.o 1 


Primary Tr1 rest


0.0 


Small airway epithelium TNFalpha 
+ IL-lbeta 


0.0 


CD45RA CD4 lymphocyte act 


7.8 


Coronary artery SMC rest


56.6 f 


CD45RO CD4 lymphocyte act 


0.0 


Coronary artery SMC TNFalpha +
IL-1beta


66.9 


CD8 lymphocyte act 


0.0 


Astrocytes rest 


23.2 1 


Secondary CD8 lymphocyte rest 


0.0 (Astrocytes TNFalpha + IL-lbeta 


14.8 I 


Secondary CD8 lymphocyte act 


0.0 


KU-812 (Basophil) rest 


0.0 j 


CD4 lymphocyte none 


0.0 


KU-812 (Basophil) 
PMA/ionomycin 


0.0 


2ryThl/Th2/Trl anti-CD95 
CH11 


0.0 


CCD1 106 (Keratinocytes) none 


3l.y j 


LAK cells rest 


0.0 


CCD1 106 (Keratinocytes) 
TNFalpha + IL-lbeta 


9.4 


LAK cells IL-2 


0.0 


Liver cirrhosis 


5.1 | 


LAK cells IL-2+IL-12 


0.0 


NCI-H292 none 


0.0 f 


LAK cells EL-2+IFN gamma |0.0 


NCI-H292 IL-4 


0.O | 


LAK cells IL-2+ IL-1 8 |6.0 


NCI-H292 IL-9 


0.0 J 


LAK cells PMA/ionomycin |0.0 


NCI-H292 IL-13 


O.O 1 


NK Cells IL-2 rest }2.5 


NCI-H292 IFN gamma 


3.6 ■ | 


Two Way MLR 3 day Jo.O 


HPAEC none


«.4 i 


Two Way MLR 5 day ( 


).0 


HPAEC TNF alpha + IL-1 beta ; 


17.9 j 


Two Way MLR 7 day ( 


).0 


Lung fibroblast none ] 


100.0 1 


PBMCrest ( 


! 


Lung fibroblast TNF alpha + IL-1 \ 
3eta 


>0.8 


PBMCPWM 2 


1.2 . ] 


Lung fibroblast TL-4 1 


!2.2 | 


PBMC PHA-L l 


0.1 ] 


aing fibroblast IL-9 A 


[ 7.6 j 


Ramos (B cell) none c 


».0 ■ I 


„ung fibroblast IL-13 1 


1.8 


Ramos (B cell) ionomycin 0 


>.0 I 


.ung fibroblast IFN gamma 6 


l-l 1 


B lymphocytes PWM 0 


.0 I 


)ermal fibroblast CCD 1 070 rest 2 


8.7 


B lymphocytes CD40L and IL-4 2 


.2 1 
a 


)ermal fibroblast CCD1070 TNF „ 
Ipha 2 


3.3


EOL-1 dbcAMP


9.2 


Dermal fibroblast CCD1070 IL-1
beta


28.7


EOL-1 dbcAMP
PMA/ionomycin 


2.7 


Dermal fibroblast IFN gamma 


16.7 


Dendritic cells none 


0.0 


Dermal fibroblast IL-4


1^ i 

ID, l 


Dendritic cells LPS 


0.0 


Dermal Fibroblasts rest


JO.O 


Dendritic cells anti-CD40 


0.0 


Neutrophils TNFa+LPS 


0.0 


Monocytes rest 


0.0 


Neutrophils rest 


0.0 


Monocytes LPS 


6.6 


Colon 


0.0 


Macrophages rest 


0.0 


Lung 


1.7 


Macrophages LPS 


0.0 


Thymus 


0.0 


HUVEC none 


48.3 


Kidney 


5.5 


HUVEC starved 


61.1 







Table YE. Panel 5 Islet 



Tissue Name 


Rel.
Exp.(%)
Ag5274, 
Run 

307720339 


Tissue Name 


Rel. 

Exp.(%) 
Ag5274, 
Run 

307720339 


97457_Patient-02go_adipose 


15.3 


94709_Donor 2 AM - A_adipose 


13.9 


97476_Patient-07sk_skeletal
muscle 


0.0 


94710_Donor 2 AM - B_adipose 


15.2 


97477_Patient-07ut_uterus


13.7


94711_Donor 2 AM - C_adipose


19.8


97478_Patient-07pl_placenta


9.0 


94712_Donor 2 AD - A_adipose 


58.2 


99167_Bayer Patient 1


51.8 


94713_Donor 2 AD - B_adipose


29.7 


97482_Patient-08ut_uterus


24.3 


94714_Donor 2 AD - C_adipose 


34.9 


97483_Patient-08pl_placenta


0.0 


94742_Donor 3 U - A_Mesenchymal
Stem Cells 


62.9 


97486_Patient-09sk_skeletal
muscle 


0.0 


94743_Donor 3 U - B_Mesenchymal
Stem Cells 


39.5 


97487_Patient-09ut_uterus


7.3 


94730_Donor 3 AM - A_adipose 


31.4 


97488_Patient-09pl_placenta


11.9 


94731_Donor 3 AM - B_adipose


35.1 


97492_Patient-10ut_uterus


12.8 


94732_Donor 3 AM - C_adipose 


49.3 


97493_Patient-10pl_placenta


5.3 


94733_Donor 3 AD - A_adipose


28.9 


97495_Patient-11go_adipose


5.3 


94734_Donor 3 AD - B_adipose 


44.8 


97496_Patient-11sk_skeletal
muscle 


3.8 


94735_Donor 3 AD - C_adipose


17.7 


97497_Patient-11ut_uterus


20.9 


77138_Liver_HepG2untreated


5.0 


97498_Patient-11pl_placenta




73556_Heart_Cardiac stromal cells
(primary)


>5.5 


97500_Patient-12go_adipose


>7.0 5 


51735_Small Intestine


97501_Patient-12sk_skeletal
muscle


12.5


72409_Kidney_Proximal Convoluted
Tubule


15.2 


97502_Patient-12ut_uterus


10.2 


82685_Small intestine_Duodenum 


0.0 


97503_Patient-12pl_placenta


2.4 


90650_Adrenal_Adrenocortical 
adenoma 


19 9 


94721_Donor 2 U -
A_Mesenchymal Stem Cells 


100.0 


72410_Kidney_HRCE


0.0 


94722_Donor 2 U -
B_Mesenchymal Stem Cells


43.2 


72411_Kidney_HRE


25.7 


94723_Donor 2 U -
C_Mesenchymal Stem Cells


63.7 


73139_Uterus_Uterine smooth
muscle cells 


97.9 



CNS_neurodegeneration_v1.0 Summary: Ag5274 This panel confirms the
expression of this gene at low levels in the brain in an independent group of individuals.
This gene appears to be slightly down-regulated in the temporal cortex of Alzheimer's
disease patients. Therefore, up-regulation of this gene or its protein product, or treatment
with specific agonists for this receptor, may be of use in reversing the dementia, memory
loss, and neuronal death associated with this disease.

General_screening_panel_v1.5 Summary: Ag5274 Highest expression of this
gene is seen in the cerebellum (CT=29.3). Moderate expression of this gene is seen
throughout the brain. Thus, this gene would be useful for distinguishing brain tissue from
non-neural tissue, and may be beneficial as a drug target in neurodegenerative disease,
specifically in disorders that have this brain region as the site of pathology, such as autism and
the ataxias. Please see Panel CNS_neurodegeneration_v1.0 for further discussion of potential
utility in the central nervous system.

Low but significant expression is also seen in pancreas. This gene encodes a protein
with homology to carnitine palmitoyltransferase. Giannessi et al. have shown that inhibition
of this enzyme produces a significant reduction in serum glucose levels (J Med Chem 2001
Jul 19;44(15):2383-6). Thus, modulation of this enzyme may also be useful in the treatment
of obesity and/or diabetes.

Panel 4.1D Summary: Ag5274 Highest expression of this gene is seen in
untreated lung fibroblasts. Low but significant expression is also seen in a cluster of
treated and untreated lung and dermal fibroblasts. Low levels of expression are also seen in
coronary artery SMCs and HUVECs. This profile suggests that this gene could be used to
differentiate between these cells and other cell samples. In addition, this gene product may
be involved in inflammatory conditions of the lung and skin.

Panel 5 Islet Summary: Ag5274 Expression is limited to [illegible]
mesenchymal stem cells (CTs=34.5).
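
The summaries above quote raw cycle-threshold (CT) values alongside tables of relative expression. As a rough illustration of how the two might relate, the sketch below converts CT values into percentages of the highest-expressing sample. It assumes ideal amplification (one doubling per cycle) and normalisation to the lowest CT on the panel; neither assumption is stated in this section, and the second sample name and its CT value are hypothetical.

```python
def relative_expression(ct_values):
    """Map sample name -> percent expression relative to the lowest-CT sample.

    Assumption (not taken from the source): perfect doubling per cycle,
    so a difference of one cycle corresponds to a two-fold change.
    """
    if not ct_values:
        raise ValueError("no CT values supplied")
    ct_min = min(ct_values.values())
    return {name: 100.0 * 2.0 ** (ct_min - ct) for name, ct in ct_values.items()}


# CT=29.3 is the cerebellum value quoted in the summary; the second
# sample and its CT are hypothetical, placed exactly one cycle later.
example = relative_expression({"cerebellum": 29.3, "hypothetical_sample": 30.3})
```

Under these assumptions the lowest-CT sample is reported as 100% and a sample one cycle later as roughly half of that, which matches the way the tables scale every panel to a single 100.0 entry.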

Z. CG148431-01 and CG148431-02: AMINOTRANSFERASE 
SIMILAR TO SERINE PALMITOYLTRANSFERASE.

Expression of gene CG148431-01 and CG148431-02 was assessed using the
primer-probe set Ag5627, described in Table ZA. Results of the RTQ-PCR runs are shown
in Tables ZB, ZC, ZD and ZE. Please note that CG148431-02 represents a full-length
physical clone of the CG148431-01 gene, validating the prediction of the gene sequence.
Table ZA. Probe Name Ag5627

Primers   Sequences                                        Length  Start Position  SEQ ID No
Forward   5'-gggctcctataacttccttggt-3'                     22      555             342
Probe     TET-5'-tcctcatagactcatcatacttggctgca-3'-TAMRA    29      579             343
Reverse   5'-cctgtgccatacacctctaaaa-3'                     22      620             344
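
The entries of Table ZA can be sanity-checked mechanically. The sketch below confirms that each stated length matches its sequence and estimates the region covered by the set. It assumes that start positions are 1-based and mark the leftmost base of each oligonucleotide on a common coordinate system; that convention is an assumption made for illustration and is not stated in the table.

```python
# Oligonucleotides transcribed from Table ZA (primer-probe set Ag5627):
# name -> (sequence 5'->3', stated length, stated start position).
AG5627 = {
    "Forward": ("gggctcctataacttccttggt", 22, 555),
    "Probe": ("tcctcatagactcatcatacttggctgca", 29, 579),
    "Reverse": ("cctgtgccatacacctctaaaa", 22, 620),
}


def length_mismatches(oligos):
    """Return the names whose sequence length differs from the stated length."""
    return [name for name, (seq, stated, _) in oligos.items() if len(seq) != stated]


def footprint(oligos):
    """Return (first, last, span) covered by the whole set.

    Assumption: start positions are 1-based and give the leftmost base
    of each oligo on one shared coordinate system.
    """
    first = min(start for _, _, start in oligos.values())
    last = max(start + length - 1 for _, length, start in oligos.values())
    return first, last, last - first + 1


mismatches = length_mismatches(AG5627)
span = footprint(AG5627)
```

The same two helper functions can be applied unchanged to the other primer-probe tables in this section by swapping in their sequences, lengths and start positions.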



Table ZB. CNS_neurodegeneration_v1.0



Tissue Name 


Rel. 

Exp.(%) 
Ag5627, 
Run 

246956910 


Rel. 

Exp.(%) 
Ag5627, 
Run 

264979289 


Tissue Name


Rel. 

Exp.(%) 
Ag5627, 
Run 

246956910 


Rel. 

Exp.(%) 
Ag5627, 
Run 

264979289 


AD 1 Hippo 


17.4 


57.0 


Control (Path) 3 
Temporal Ctx 


6.4 


8.2 


AD 2 Hippo 


67.8 


4.8 


Control (Path) 4 
Temporal Ctx 


10.3 


24.0 


AD 3 Hippo 


50.0 


62.4 


AD 1 Occipital Ctx 


11.8 


26.8 


AD 4 Hippo 


19.1 


30.8 


AD 2 Occipital Ctx 
(Missing) 


0.0 


0.0 


AD 5 Hippo 


17.0 


31.2 


AD 3 Occipital Ctx 


4.2


25.9 


AD 6 Hippo 


100.0 


86.5 


AD 4 Occipital Ctx 


20.0 


27.9 


Control 2 Hippo


24.1 


31.6 


AD 5 Occipital Ctx 


37.4 


17.0 


Control 4 Hippo 


50.7 


70.7 


AD 6 Occipital Ctx 


29.1 


22.4 


Control (Path) 3 
Hippo 


21.0 


24.3 


Control 1 Occipital 
Ctx 


3.9 


12.1 


AD 1 Temporal Ctx 


43.8 


65.5 


Control 2 Occipital 
Ctx 


20.6 


29.9


AD 2 Temporal Ctx


47.6 


100.0 


Control 3 Occipital
Ctx


9.3


19.9


AD 3 Temporal Ctx 


11.0 


23.0 


Control 4 Occipital 
Ctx 


16.3 


44.1 


AD 4 Temporal Ctx 


20.4 


33.9 


Control (Path) 1 
Occipital Ctx 


49.0 


58.2 


AD 5 Inf Temporal 
Ctx 


31.0 


31.2 


Control (Path) 2 
Occipital Ctx 


6.6 


15.2 


AD 5 Sup Temporal 
Ctx 


51.1 


63.3


Control (Path) 3 
Occipital Ctx 


0.0 


1.6 


AD 6 Inf Temporal
Ctx 


68.8 


87.7 


Control (Path) 4 
Occipital Ctx 


23.3 


14.3 


AD 6 Sup Temporal 
Ctx 


56.3 


97.3 


Control 1 Parietal 
Ctx 


13.1 


18.3 


Control 1 Temporal 
Ctx 


7.3 


4.5 


Control 2 Parietal 
Ctx 


31.6 


68.8 


Control 2 Temporal 
Ctx 


12.9 


31.6 


Control 3 Parietal 
Ctx 


7.9 


19.8 


Control 3 Temporal 
Ctx 


/.y 


15.0 


Control (Path) 1 
Parietal Ctx 


63.7 


87.1 


Control 4 Temporal
Ctx 


13.8 


15.6 


Control (Path) 2 
Parietal Ctx 


51.1 


57.4 


Control (Path) 1 
Temporal Ctx 


30.1 


46.0 


Control (Path) 3 
Parietal Ctx 


3.1 


6.1 


Control (Path) 2 
Temporal Ctx 


28.7 


39.5 


Control (Path) 4 
Parietal Ctx 


54.7 


59.5 



Table ZC. Panel 4.1D 



Tissue Name 


Rel. 

Exp.(%)

Ag5627, 

Run 

246490777 


Tissue Name 


Rel. 

Exp.(%) 
Ag5627, 
Run 

246490777 


Secondary Thl act 


0.0 


HUVEC IL-lbeta 


0.0 


Secondary Th2 act 


0.4 


HUVEC IFN gamma 


16.7 


Secondary Trl act 


0.0 


HUVEC TNF alpha + IFN gamma 


0.3 


Secondary Thl rest 


0.0 


HUVEC TNF alpha + IL4 


0.0 


Secondary Th2 rest 


0.0 


HUVEC IL-11 


1.2 


Secondary Trl rest 


o.b 


Lung Microvascular EC none 


0.4 


Primary Thl act 


0.0 


Lung Microvascular EC TNFalpha 
+ IL-1beta


0.0 


Primary Th2 act 


0.2 


Microvascular Dermal EC none 


0.0 


Primary Tr1 act


0.2 


Microvascular Dermal EC
TNFalpha + IL-1beta


0.0


Primary Th1 rest


0.0 


Bronchial epithelium TNFalpha +
IL1beta


. , ' *Zl\ «1L .dt Jt 
R 4 


Primary Th2 rest 


0.0 


Small airway epithelium none 


18.7 


Primary Trl rest 


0.0 


Small airway epithelium TNFalpha 
+ IL-1beta


1 24.3 


CD45RA CD4 lymphocyte act 


2.7 


Coronary artery SMC rest


3.3 


CD45RO CD4 lymphocyte act 


6.8 


Coronary artery SMC TNFalpha +
IL-1beta


2.8 


CD8 lymphocyte act


0.0 


Astrocytes rest 


3.9 


Secondary CD8 lymphocyte rest 


0.8 


Astrocytes TNFalpha + IL-lbeta 


1.4 


Secondary CD8 lymphocyte act 


0.0 


KU-812 (Basophil) rest 


8.0 


CD4 lymphocyte none 


0.0 


KU-812 (Basophil) 
PMA/ionomycin 




2ry Thl/Th2/Trl_anti-CD95 
CH11


0.4 


CCD1 106 (Keratinocytes) none 


17.4 


LAK cells rest 


0.0 


CCD1106 (Keratinocytes)
TNFalpha + IL-1beta


24.3 


LAK cells IL-2


0.0 


Liver cirrhosis




LAK cells IL-2+IL-12 


0.2 


NCI-H292 none


1 A O 
J.I/.Z 


LAK cells IL-2+IFN gamma


0.0 


NCI-H292 IL-4


->o.a 


LAK cells IL-2+IL-18


0.0 


NCI-H292 IL-9


21.5 


LAK cells PMA/ionomycin


U.2 


NCI-H292 IL-13


27.7 


NK Cells IL-2 rest


11.8 


NCI-H292 IFN gamma 


18.3 


Two Way MLR 3 day


U.4 


HPAEC none 


0.8 


Two Way MLR 5 day




HPAEC TNF alpha + IL-1 beta


0.3 


Two Way MLR 7 day


U.U 


Lung fibroblast none 


21.5 


PBMC rest 


0.0 


Lung fibroblast TNF alpha + IL-1
beta 


2.7 


PBMC PWM


0.0 


Lung fibroblast IL-4 


10.2 


PBMC PHA-L 


1.3 


Lung fibroblast IL-9


6.2 




Ramos (B cell) none 


0.0 


Lung fibroblast IL-13 


1.3 


• 


Ramos (B cell) ionomycin 


0.0 


Lung fibroblast IFN gamma 


43.5 




B lymphocytes PWM 


o.o 


Dermal fibroblast CCD1070 rest 


D.O 


] 


3 lymphocytes CD40L and IL-4 


D.O 


Dermal fibroblast CCD1070 TNF 
alpha 


i i 

A* 1 


] 


50L-1 dbcAMP 


3.5 


Dermal fibroblast CCD1070 IL-1 
xta 


1.6 


I 

i 


hOT -1 Hhr AMP 

5 MA/ionomycin 


).0 


Dermal fibroblast IFN gamma ; 


59.5 


dendritic cells none 


1.1 I 


Dermal fibroblast IL-4 ] 


12.0 


i 


dendritic cells LPS ( 


).0 I 


Dermal Fibroblasts rest 1 


16.0 


r 


)endritic cells anti-CD40 ( 


).o ■ r 


^eutrophOsTNFa+LPS C 


).0 


» 


Monocytes rest C 


).o r 


Neutrophils rest C 


>.o 


h 


/fonocytes LPS C 


).0 ( 


?olon 3 


.0 


I 


/lacrophages rest C 


(.0 I 


-ung 4 


.6 


[Macrophages LPS C 


1.0 


Thymus 3 


.5


HUVEC none


0.7 


Kidney




HUVEC starved 


2.9 







Table ZD. Panel 5 Islet 






Tissue Name 


Rel.

Exp.(%) 
Ag5627 

Run

2793714J 
3 


Rel. 

Exp.(%) 
Ag5627, 
Run
l 31285251 
5 


Tissue Name 


Rel. 

Exp.(% 

Ag5627 

Run 

2793714 


Rel. 
) Exp.(%) 
, Ag5627, 

Run 
1 3128525 


97457_Patient-02go_adipose


0.7


1.7 


94709_Donor 2 AM - A_adipose


1.2 


1.6 


97476_Patient-07sk_skeletal
muscle


0.0


0.0 


94710_Donor 2 AM - B_adipose


1.1 


1.7 


97477_Patient-07ut_uterus


0.4 


0.5 


94711_Donor 2 AM - C_adipose


0.8 


1.4 


97478_Patient-07pl_placenta


40.3 


46.0 


94712_Donor 2 AD - A_adipose


2.7 


2.0 


99167_Bayer Patient 1


0.1 


0.1 


94713_Donor 2 AD - B_adipose


4.0 


3.0 


97482_Patient-08ut_uterus


0.2 


0.2 


94714_Donor 2 AD - C_adipose


3.0 


3.0 


97483_Patient-08pl_placenta


82.9 


100.0 


94742_Donor 3 U -
A_Mesenchymal Stem Cells


0.4 


0.4 


97486_Patient-09sk_skeletal
muscle


0.2 


0.1 


94743_Donor 3 U -
B_Mesenchymal Stem Cells


0.3 


0.6 


97487_Patient-09ut_uterus


0.2 


0.5 


94730_Donor 3 AM - A_adipose


3.5 


3.7 


97488_Patient-09pl_placenta


29.9 


25.5 


94731_Donor3AM- 
B_adipose 


5.3 


5.6 


97492_Patient-10ut_uterus


0.3 


0.4 


94732_Donor 3 AM - C_adipose


3.9 


4.8 


97493_Patient-10pl_placenta


100.0 


71.7 


94733_Donor3AD- 
A__adipose 


2.6 


3.5 


97495_Patient-11go_adipose


1.2 


5.9 


94734_Donor3AD- 
B_adipose 


2.8 


3.6 


97496_Patient-11sk_skeletal
muscle


).2 ( 


U < 


M735_Donor3AD- 
:_adipose_ ( 


).5 


).8 


97497JPatient-llut__uterus ( 


).5 ( 


).8 

c 


r7138JLiverJHepG2untreate . 
I 


59.5 < 


G.2 


97498_Patient-llpLpIacent „ 
a 2 


!8.1 2 


11.6 

c 


r 3556JHeart_.Cardiac stromal f 
ells (primary) 


>.l 


>.o 


97500_Patient-12go__adipos 
e 1 


.0 1 


-8 8 


1735_SmaIl Intestine 1 


.8 1 


.9 


97501JPatient-12s]eskeleta 
1 muscle ^ 


.5 


* I 


2409_JCidneyJ>roximal 
involuted Tubule 


8.2 1 


9.1


97502_Patient-12ut_uterus


0.3 


0.4 


82685_Small intestine_Duodenum


1.3 


1.1 


97503_Patient-12pl_placenta


oj.y 


00.5 


90650_Adrenal_Adrenocortical
adenoma


0.6 


0.4 


94721_Donor 2 U -
A_Mesenchymal Stem
Cells


1.2 


1.3 


72410_Kidney_HRCE 


3.7 


4.9 


94722_Donor 2 U -
B_Mesenchymal Stem
Cells


0.6 


0.8 


72411_Kidney_HRE 


1.6 


1.7 


94723_Donor 2 U -
C_Mesenchymal Stem
Cells


1.0 


1.3 


73139_Uterus_Uterine 
smooth muscle cells 


1.0 


0.7 



Table ZE. general oncology screening panel_v_2.4

Tissue Name 


Rel. 

Exp.(%) 
Ag5627, 
Run 

268787222 


Tissue Name


Rel.

Exp.(%)
Ag5627, 
Run 

268787222 


Colon cancer 1


2.8 


Bladder NAT 2 


0.3 


Colon NAT 1


2.7 


Bladder NAT 3


0.2 


Colon cancer 2


7.8 


Bladder NAT 4 


1.1 


Colon NAT 2


3.1 


Prostate adenocarcinoma 1 


11.8


Colon cancer 3


5.7 


Prostate adenocarcinoma 2 


1.0 


Colon NAT 3


PT4 


Prostate adenocarcinoma 3 


8.6 


Colon malignant cancer 4 


3.0 


Prostate adenocarcinoma 4 


1.7 


Colon NAT 4


2.4 


Prostate NAT 5 


1.1 


Lung cancer 1


2.9 


Prostate adenocarcinoma 6 


2.6 


Lung NAT 1


1.1 


Prostate adenocarcinoma 7 


3.3 


Lung cancer 2


16.2 


Prostate adenocarcinoma 8 


0.6 


Lung NAT 2 


2.3 


Prostate adenocarcinoma 9 


6.5 


Squamous cell carcinoma 3


4.8 


Prostate NAT 10 


1.4 


Lung NAT 3 


0.5 


Kidney cancer 1 


14.2 


Metastatic melanoma 1


8.7 


Kidney NAT 1 


7.6 


Melanoma 2 


3.7 


Kidney cancer 2 


100.0 


Melanoma 3


hi ) 


Kidney NAT 2 


15.6 


Metastatic melanoma 4


16.3 ] 


Kidney cancer 3


58.7 


[Metastatic melanoma 5 : 


10,2 1 


CidneyNAT3 t 


5.5 


[Bladder cancer 1 


-.3 I 


Cidney cancer 4 ] 


1.8 


gladder NAT 1 c 


1.0 I 


Cidney NAT 4 I 




Bladder cancer 2


3.9


CNS_neurodegeneration_v1.0 Summary: Ag5627 Two experiments with the same
probe-primer set are in good agreement. This panel confirms the expression of this gene
at low levels in the brain in an independent group of individuals. This gene is found to be
upregulated in the temporal cortex of Alzheimer's disease patients. Therefore, therapeutic
modulation of the expression or function of this gene may decrease neuronal death and be
of use in the treatment of this disease.
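
The statement about temporal cortex expression can be checked informally against the tabulated values. The sketch below averages the temporal cortex entries from Run 264979289 of Table ZB for the Alzheimer's disease (AD) samples and for the control samples. Comparing plain means is an assumption made here for illustration only; it is not the statistical analysis used in the source, and the variable names are hypothetical.

```python
from statistics import mean

# Rel. Exp.(%) values for temporal cortex samples, transcribed from
# Table ZB, Run 264979289 (eight AD rows and eight control rows).
ad_temporal = [65.5, 100.0, 23.0, 33.9, 31.2, 63.3, 87.7, 97.3]
control_temporal = [4.5, 31.6, 15.0, 15.6, 46.0, 39.5, 8.2, 24.0]

ad_mean = mean(ad_temporal)
control_mean = mean(control_temporal)

# Ratio of group means; a value above 1 is consistent with higher
# expression in the AD samples, but says nothing about significance.
fold = ad_mean / control_mean
```

A ratio of group means ignores donor-to-donor variability and RNA quality, so it should be read only as a quick consistency check on the summary.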

Panel 4.1D Summary: Ag5627 Highest expression of this gene is detected in
kidney. Moderate to low levels of expression of this gene are also seen in activated naive and
memory T cells, IL-2 treated NK cells, IFN gamma activated HUVEC cells, cytokine
activated bronchial epithelial cells, astrocytes, resting and activated small airway epithelial
cells, coronary artery SMC cells, basophils, keratinocytes, mucoepidermoid NCI-H292
cells, lung and dermal fibroblasts, a liver cirrhosis sample and normal tissues such as colon,
lung, and thymus. Therefore, therapeutic modulation of this gene or its protein product
through the use of a small molecule drug may be useful in the treatment of autoimmune and
inflammatory diseases such as asthma, allergies, inflammatory bowel disease, lupus
erythematosus, psoriasis, rheumatoid arthritis, and osteoarthritis.

Panel 5 Islet Summary: Ag5627 Two experiments with the same probe and primer
set are in good agreement. Highest expression of this gene is detected in placenta of
diabetic and nondiabetic patients (CTs=26.4-26.7). Moderate to high levels of expression of
this gene are also seen in the liver HepG2 cell line, adipose, small intestine and kidney. This
gene codes for a homolog of serine palmitoyltransferase 2. Serine palmitoyltransferase
catalyzes the first, rate-limiting step in de novo ceramide biosynthesis. C2-ceramide inhibits
GLUT4 translocation by inhibiting Akt phosphorylation and activation in 3T3-L1
adipocytes, independently of effects on IRS-1 (Summers et al., 1998, Mol Cell Biol
18:5457-64, PMID: 9710629). Ceramide downregulates PDE3B and induces lipolysis in
3T3-L1 cells. Ceramide effects are reversed by troglitazone (Mei et al., 2002, Diabetes 51:
631-7, PMID: 11872660). Palmitate-induced insulin resistance involves elevation of de
novo ceramide synthesis in C2C12 myotubes (Schmitz-Peiffer et al., 1999, J Biol Chem
274:24202, PMID: 10446195). Therefore, inhibition of the novel serine
palmitoyltransferase through the use of a small molecule drug may be beneficial in the
treatment of diabetes.

general oncology screening panel_v_2.4 Summary: Ag5627 Highest expression
of this gene is detected in kidney cancer (CT=27.5). Moderate to high expression of this
gene is also seen in normal and cancer samples derived from colon, lung, bladder, prostate
and kidney. Moderate levels of expression of this gene are also seen in melanoma and
metastatic melanoma samples. Expression of this gene is strongly associated with kidney,
lung and bladder cancers as compared to the corresponding normal tissues. Therefore,
expression of this gene may be used as a diagnostic marker for detection of these cancers, and
therapeutic modulation of this gene or its protein product may be useful in the
treatment of melanoma, colon, lung, bladder, prostate and kidney cancers.

AA. CG148888-01: GALNAC 4-SULFOTRANSFERASE. 

Expression of gene CG148888-01 was assessed using the primer-probe set Ag6854, 
described in Table AAA. Results of the RTQ-PCR runs are shown in Table AAB. Please 
note that CG148888-01 represents a full-length physical clone. 

Table AAA. Probe Name Ag6854

Primers   Sequences                                            Length  Start Position  SEQ ID No
Forward   5'-accccagagccgcctggt-3'                             18      369             345
Probe     TET-5'-cttggcctgatgttgaactttattcctggcacc-3'-TAMRA    33      408             346
Reverse   5'-cagcctgcaggaccctacg-3'                            19      458             347



Table AAB. General_screening_panel_v1.6



Tissue Name 


Rel. 

Exp.(%) 
Ag6854, 
Run 

278020603 


Tissue Name


Rel. 

Exp.(%) 
Ag6854, 
Run 

278020603 


Adipose 


0.0 


Renal ca. TK-10 


0.0 


Melanoma* Hs688(A).T 


0.0 


Bladder 


0.1 


Melanoma* Hs688(B).T 


0.2 


Gastric ca. (liver met.) NCI-N87 


0.0 


Melanoma* M14 


0.0 


Gastric ca. KATO III 


0.0 


Melanoma* LOXMVI 


0.0 


Colon ca. SW-948 


0.0 


Melanoma* SK-MEL-5 


0.3 


Colon ca. SW480 


0.1 


Squamous cell carcinoma SCC-4 


0.1 


Colon ca.* (SW480 met) SW620 


0.0 


Testis Pool 


0.2 


Colon ca. HT29 


0.0 


Prostate ca.* (bone met) PC-3 


0.0 


Colon ca. HCT-116


0.0 


Prostate Pool 


0.0 


Colon ca. CaCo-2 


0.0


Placenta


0.0 


Colon cancer tissue




Uterus Pool 


0.0


Colon ca. SW1116 


0.0 


Ovarian ca. OVCAR-3 


0.0 


Colon ca. Colo-205


0.0 


Ovarian ca. SK-OV-3 


0.0 


Colon ca. SW-48 


0.0


Ovarian ca. OVCAR-4


0.0


Colon Pool 


0.2


Ovarian ca. OVCAR-5


0.1 


Small Intestine Pool 


0.1


Ovarian ca. IGROV-1


0.0 


Stomach Pool 




Ovarian ca. OVCAR-8


0.0


Bone Marrow Pool


o.i 1 


Ovary


0.2


Fetal Heart


0.3


Breast ca. MCF-7


0.7


Heart Pool


0.0


Breast ca. MDA-MB-231


0.0


Lymph Node Pool


AC 1 
^ f 


Breast ca. BT 549


0.0


Fetal Skeletal Muscle


0.0


Breast ca. T47D


0.0 


Skeletal Muscle Pool 


0.0 


Breast ca. MDA-N


0.0 


Spleen Pool 


0.6 j 


Breast Pool


0.2 


Thymus Pool 


0.5 i 


Trachea


0.3 


CNS cancer (glio/astro) U87-MG 


0.0 | 


Lung 


0.2 


CNS cancer (glio/astro) U-118-MG


o.o 1 


Fetal Lung 


0.0 


CNS cancer (neuro;met) SK-N-AS 


2.2 j 


Lung ca. NCI-N417


0.0


CNS cancer (astro) SF-539


0.0 I 


Lung ca. LX-1 


0.0 


CNS cancer (astro) SNB-75 


0.7 1 


Lung ca. NCI-H146


0.0 j 


CNS cancer (glio) SNB-19 


0.0 j 


Lung ca. SHP-77 


100.0 


CNS cancer (glio) SF-295 


0.1 j 


Lung ca. A549 


0.0 


Brain (Amygdala) Pool 


3.7 j 


Lung ca. NCI-H526


0.4 


Brain (cerebellum) 


8.8 |" 


Lung ca. NCI-H23


0.2 


Brain (fetal) 


16.2 J 


Lung ca. NCI-H460


0.1 


Brain (Hippocampus) Pool 


3.6 J 


Lung ca. HOP-62


3.0 


Cerebral Cortex Pool 


3.7 | 


Lung ca. NCI-H522


1.4 


Brain (Substantia nigra) Pool < 


t.6 j 


Liver 


).0 


Brain (Thalamus) Pool 


).0 i 


Fetal Liver ( 


).0 


Brain (whole) t 


1 C I 


Liver ca.HepG2 ( 


).0 I 


Spinal Cord Pool l 


L7 j 


Kidney Pool ( 


).0 , 


\drenal Gland ( 




Fetal Kidney ( 


(.0 I 


'ituitary gland Pool j 


.0 ! 


Renal ca. 786-0 C 


1.0 < 


Salivary Gland c 


f.O j 


Renal ca. A498 C 


i.O 1 


Tiyroid (female) o 


.2 | 


Renal ca.ACHN 0 


.0 I 


>ancreatic ca. CAPAN2 0 


.1 J 


Renal ca. UO-31 0 


•0 {Pancreas Pool 6 


.2* ~ | 



General_screening_panel_v1.6 Summary: Ag6854 Highest expression of this
gene is seen in a lung cancer cell line (CT=27.8). Thus, expression of this gene could be
used to differentiate between this sample and other samples on this panel and as a marker to
detect the presence of lung cancer. Furthermore, therapeutic modulation of the expression
or function of this gene may be effective in the treatment of lung cancer.

This gene is also expressed at moderate to low levels in the CNS, including the
hippocampus, thalamus, substantia nigra, amygdala, cerebellum and cerebral cortex.
Therefore, therapeutic modulation of the expression or function of this gene may be useful
in the treatment of neurological disorders, such as Alzheimer's disease, Parkinson's disease,
schizophrenia, multiple sclerosis, stroke and epilepsy.

AB. CG149008-01: NOVEL SODIUM/HYDROGEN 
EXCHANGER FAMILY MEMBER. 

Expression of gene CG149008-01 was assessed using the primer-probe set Ag5630,
described in Table ABA. Results of the RTQ-PCR runs are shown in Tables ABB, ABC,
ABD and ABE.

Table ABA. Probe Name Ag5630

Primers   Sequences                                         Length  Start Position  SEQ ID No
Forward   5'-tattttctgggtcaggctgat-3'                       21      770             348
Probe     TET-5'-tctctaaactcaacatgacagacagttttg-3'-TAMRA    30      795             349
Reverse   5'-cagatattagggagccaaacg-3'                       21      825             350



Table ABB. CNS_neurodegeneration_v1.0



Tissue Name 


Rel.

Exp.(%) 
Ag5630, 
Run 

246956911 


Tissue Name


Rel. 

Exp.(%) 
Ag5630, 
Run 

246956911 


AD 1 Hippo 


9.3


Control (Path) 3 Temporal Ctx 


9.3 


AD 2 Hippo


31.4


Control (Path) 4 Temporal Ctx


14.5 


AD 3 Hippo


5.5


AD 1 Occipital Ctx


7.5 


AD 4 Hippo 


8.4 


AD 2 Occipital Ctx (Missing) 


0.0 


AD 5 Hippo


62.0 


AD 3 Occipital Ctx 


4.5 


AD 6 Hippo 


46.0 


AD 4 Occipital Ctx 


18.9 


Control 2 Hippo 


31.4 


AD 5 Occipital Ctx 


13.9 


Control 4 Hippo 


15.9 


AD 6 Occipital Ctx 


46.3 


Control (Path) 3 Hippo 


10.4 


Control 1 Occipital Ctx 


3.8


AD 1 Temporal Ctx


12.0 


Control 2 Occipital Ctx




AD 2 Temporal Ctx 


41.8 


Control 3 Occipital Ctx 


6.1 


AD 3 Temporal Ctx 


2.3 


Control 4 Occipital Ctx 


13.2 


AD 4 Temporal Ctx 


25.7 


Control (Path) 1 Occipital Ctx 


62.0 


AD 5 Inf Temporal Ctx


100.0 


Control (Path) 2 Occipital Ctx 


10.5 


AD 5 Sup Temporal Ctx


48.6 


Control (Path) 3 Occipital Ctx 


8.4 


AD 6 Inf Temporal Ctx 


36.9 


Control (Path) 4 Occipital Ctx 


11 R 

X X -<J 


AD 6 Sup Temporal Ctx 


45.7 


Control 1 Parietal Ctx 


10.4 


Control 1 Temporal Ctx 


14.3 


Control 2 Parietal Ctx 


49.0 


Control 2 Temporal Ctx 


48.6 


Control 3 Parietal Ctx 


20.3 


Control 3 Temporal Ctx


12.8 


Control (Path) 1 Parietal Ctx 


44.1 


Control 4 Temporal Ctx 


14.1 


Control (Path) 2 Parietal Ctx 


22.7 


Control (Path) 1 Temporal Ctx 


52.5 


Control (Path) 3 Parietal Ctx 


8.2 


Control (Path) 2 Temporal Ctx 


33.9 


Control (Path) 4 Parietal Ctx


35.1



Table ABC. General_screening_panel_v1.5



Tissue Name 


Rel. 

Exp.(%) 
Ag5630, 
Run 

245065625 


Tissue Name


Rel. 

Exp.(%) 
Ag5630, 
Run

245065625 


Adipose 


4.2 


Renal ca. TK-10 


32.8 


Melanoma* Hs688(A).T 


21.9 


Bladder 


9.5 


Melanoma* Hs688(B).T 


19.2 


Gastric ca. (liver met.) NCI-N87 


100.0 


Melanoma* M14 


41.2 


Gastric ca. KATO III


52.1 


Melanoma* LOXMV1 


25.2 


Colon ca. SW-948 


5.1 


Melanoma* SK-MEL-5 


20.0 


Colon ca. SW480 


27.2


Squamous cell carcinoma SCC-4 


8.4 


Colon ca.* (SW480 met) SW620


22.2 


Testis Pool 


9.1 


Colon ca. HT29 


10.5 


Prostate ca.* (bone met) PC-3 


5.8 


Colon ca. HCT-116


15.6 


Prostate Pool 


3.0 


Colon ca. CaCo-2 


25.9 


Placenta 


16.7 




Colon cancer tissue 


12.9 


Uterus Pool 


43 


Colon ca. SW1116


3.4


Ovarian ca. OVCAR-3 


35.6 


Colon ca. Colo-205


19.8 


Ovarian ca. SK-OV-3 


15.4 


Colon ca. SW-48


12.6 


Ovarian ca. OVCAR-4 


9.5 


Colon Pool


6.4 


Ovarian ca. OVCAR-5 


44.8 


Small Intestine Pool


4.0 


Ovarian ca. IGROV-1


13.9 


Stomach Pool


3.7 


Ovarian ca. OVCAR-8 


8.0 


Bone Marrow Pool


2.9 


Ovary 


3.8 


Fetal Heart


41 


Breast ca. MCF-7 


14.9 


Heart Pool


3.3 


Breast ca. MDA-MB-231 


25.2 


Lymph Node Pool


5.8


Breast ca. BT 549


32.1


Fetal Skeletal Muscle




Breast ca. T47D 


18.7 


Skeletal Muscle Pool 


15.6 


Breast ca. MDA-N 


9.3 


Spleen Pool 


5.4 


Breast Pool 


1.7 


Thymus Pool 


7.6 


Trachea 


18.4 


CNS cancer (glio/astro) U87-MG 


74.2 


Lung 


1.7 


CNS cancer (glio/astro) U-118-MG


34.4 


Fetal Lung 


9.2 


CNS cancer (neuro;met) SK-N-AS 


8.5 


Lung ca. NCI-N417 


4.8 


CNS cancer (astro) SF-539 


11.9 


Lung ca. LX-1 


24.1 


CNS cancer (astro) SNB-75 


43.2 


Lung ca. NCI-H146 


3.6 


CNS cancer (glio) SNB-19 


12.9 


Lung ca. SHP-77 


14.0


CNS cancer (glio) SF-295


30.8 


Lung ca. A549


35.4 


Brain (Amygdala) Pool 


4.9 


Lung ca. NCI-H526 


3.5 


Brain (cerebellum) 


23.7 


Lung ca. NCI-H23


23.5 


Brain (fetal) 


6.5 


Lung ca. NCI-H460


6.7 


Brain (Hippocampus) Pool 17.5 


Lung ca. HOP-62


7.6 


Cerebral Cortex Pool 15.3 


Lung ca. NCI-H522


8.5 


Brain (Substantia nigra) Pool


4.3


Liver 


4.2 


Brain (Thalamus) Pool


7.4


Fetal Liver 


15.8 


Brain (whole) 


5.4 


Liver ca. HepG2 


5.7 


Spinal Cord Pool 


5.4 


Kidney Pool 


7.7 


Adrenal Gland 


24.1 


Fetal Kidney 


5.0 


Pituitary gland Pool 


U 


Renal ca. 786-0 


19.9 


Salivary Gland 


L3.2 


Renal ca. A498 


14.3 


rhyroid (female) I 


5.1 


Renal ca. ACHN 


3.9 ] 


Pancreatic ca. CAPAN2 S 


16.1


Renal ca. UO-31

< 


32.1 ] 


5 ancreas Pool J93 



Table ABD. Panel 4.1D 



Tissue Name 


Rel. 
Exp.(%)
Ag5630, 
Run 

246490808 


Tissue Name 


Rel. 

Exp.(%) 
Ag5630, 
Run 

246490808 


Secondary Thl act 


52.9 


HUVEC IL-1beta


21.9 


Secondary Th2 act 


86.5 


HUVEC IFN gamma 


20.2 


Secondary Trl act 


14.5


HUVEC TNF alpha + IFN gamma 


6.7 


Secondary Thl rest 


2.2 


HUVEC TNF alpha + IL4 


4.6 


Secondary Th2 rest 


1.7 


HUVEC IL-11 


12.6


Secondary Tr1 rest


0.0 


Lung Microvascular EC none 


31.6 


Primary Thl act 


0.8 


Lung Microvascular EC TNFalpha 
+ IL-1beta


9.4 


Primary Th2 act 


42.6


Microvascular Dermal EC none


0.7


Primary Tr1 act


^5 4 


Microvascular Dermal EC
TNFalpha + IL-1beta


J.jL 


Primary Th1 rest


1 9 


Bronchial epithelium TNFalpha + 
ILlbeta 




Primary Th2 rest 


3.4 


Small airway epithelium none 


4.5 


Primary Trl rest 


0.3 


Small airway epithelium TNFalpha 
+ IL-lbeta 


29.1 


CD45RA CD4 lymphocyte act 


30.6 


Coronary artery SMC rest


9.9 


CD45RO CD4 lymphocyte act 


49.3 


Coronary artery SMC TNFalpha +
IL-1beta


13.3 


CD8 lymphocyte act 


4.6 


Astrocytes rest 


2.6 


Secondary CD8 lymphocyte rest 


29.9 


Astrocytes TNFalpha + IL-1beta


4.2 


Secondary CD8 lymphocyte act 


6.6 


KU-812 (Basophil) rest 


4.9 


CD4 lymphocyte none




KU-812 (Basophil) 
PMA/ionomycin 


11 o 

ll.V 


2ry Th1/Th2/Tr1_anti-CD95
CH11 


2.5 


CCD1106 (Keratinocytes) none


28.3 


LAK cells rest 


in 


CCD1106 (Keratinocytes) 
TNFalpha + IL-1beta


18.6 


LAK cells IL-2


Q 7 


Liver cirrhosis 


4.6 




Z.J 


NCI-H292 none


463 


LAK cells IL-2+IFN gamma


IT 3 
1 I.J 


NCI-H292 IL-4


46.0 


LAK cells IL-2+IL-18 


9.5 


NCI-H292 IL-9 


69.3 


LAK cells PMA/ionomycin 


363 


NCI-H292 IL-13 


59.0 


NK Cells IL-2 rest 


17.0 


NCI-H292 IFN gamma


33.9 


Two Way MLR 3 day 


9.4 


HPAEC none 


12.9 


Two Way MLR 5 day 


1.0 


HPAEC TNF alpha + IL-1 beta


70.2 


Two Way MLR 7 day 


7.0 


Lung fibroblast none 


14.2 


PBMC rest 


0.9 


Lung fibroblast TNF alpha + IL-1
beta 


20.0 


PBMC PWM


9.9 


Lung fibroblast IL-4 


12.4 


PBMC PHA-L 


8.4 


Lung fibroblast IL-9 


4.8 


Ramos (B cell) none 


1.4 


Lung fibroblast IL-13


2.7 


Ramos (B cell) ionomycin 


28.5 


Lung fibroblast IFN gamma


27.7 


B lymphocytes PWM 


19.6 


Dermal fibroblast CCD1070 rest 


33.9 


B lymphocytes CD40L and IL-4


60.1 


Dermal fibroblast CCD1070 TNF 
alpha 




EOL-1 dbcAMP 


3.8 


Dermal fibroblast CCD1070 IL-1
beta


18.3 


EOL-1 dbcAMP
PMA/ionomycin


).4 


Dermal fibroblast IFN gamma 


19.3 


Dendritic cells none


).2


Dermal fibroblast IL-4


$7.4


Dendritic cells LPS


*.2 1


Dermal Fibroblasts rest


15.8


Dendritic cells anti-CD40


*.8 1


Neutrophils TNFa+LPS


(7.6


Monocytes rest


0.0


Neutrophils rest


H.2


Monocytes LPS


100.0


Colon


.5



Macrophages rest



6.0 



Lung 



Macrophages LPS 



10.6 



Thymus 



2.4 



HUVEC none



12.6 



Kidney 



17.2 



HUVEC starved 



21.5 



Table ABE. Panel 5 Islet

Tissue Name | Rel. Exp.(%) Ag5630, Run 279370866 | Tissue Name | Rel. Exp.(%) Ag5630, Run 279370866
97457_Patient-02go_adipose | 15.5 | 94709_Donor 2 AM - A_adipose | 26.6
97476_Patient-07sk_skeletal muscle | 71 0 | |
97477_Patient-07ut_uterus | 5.0 | 94711_Donor 2 AM - C_adipose | 16.7
97478_Patient-07pl_placenta | 9.3 | 94712_Donor 2 AD - A_adipose | 55.9
99167_Bayer Patient 1 | 100.0 | 94713_Donor 2 AD - B_adipose | 74.7
97482_Patient-08ut_uterus | 11.0 | 94714_Donor 2 AD - C_adipose | 54.7
97483_Patient-08pl_placenta | 7.9 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 5.7
97486_Patient-09sk_skeletal muscle | 9.9 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 8.0
97487_Patient-09ut_uterus | 4.1 | 94730_Donor 3 AM - A_adipose | 8.3
97488_Patient-09pl_placenta | 10.3 | 94731_Donor 3 AM - B_adipose | 14.3
97492_Patient-10ut_uterus | 10.2 | 94732_Donor 3 AM - C_adipose | 11.3
97493_Patient-10pl_placenta | 20.9 | 94733_Donor 3 AD - A_adipose | 30.1
97495_Patient-11go_adipose | 5.8 | 94734_Donor 3 AD - B_adipose | 22.5
97496_Patient-11sk_skeletal muscle | 4.4 | 94735_Donor 3 AD - C_adipose | 7.5
97497_Patient-11ut_uterus | 13.5 | 77138_Liver_HepG2untreated | 2.5
97498_Patient-11pl_placenta | 3.4 | 73556_Heart_Cardiac stromal cells (primary) | 2.7
97500_Patient-12go_adipose | 37.1 | 81735_Small Intestine | 12.6
97501_Patient-12sk_skeletal muscle | 20.2 | 72409_Kidney_Proximal Convoluted Tubule | 28.1
97502_Patient-12ut_uterus | 22.8 | 82685_Small intestine_Duodenum | 24.0
97503_Patient-12pl_placenta | 13.1 | 90650_Adrenal_Adrenocortical adenoma | 7.3
94721_Donor 2 U - A_Mesenchymal Stem Cells | 87.7 | 72410_Kidney_HRCE | 33.0
94722_Donor 2 U - B_Mesenchymal Stem Cells | 75.8 | 72411_Kidney_HRE | 10.4
94723_Donor 2 U - C_Mesenchymal Stem Cells | 77.9 | 73139_Uterus_Uterine smooth muscle cells | 11.8

CNS_neurodegeneration_v1.0 Summary: Ag5630 This panel confirms the
expression of this gene at low levels in the brains of an independent group of individuals. 
However, no differential expression of this gene was detected between Alzheimer's 
diseased postmortem brains and those of non-demented controls in this experiment. Please 
see Panel 1.5 for a discussion of this gene in treatment of central nervous system disorders. 

General_screening_panel_v1.5 Summary: Ag5630 Highest expression of this
gene is detected in a gastric cancer NCI-N87 cell line (CT=27.6). Moderate levels of
expression of this gene are also seen in a cluster of cancer cell lines derived from pancreatic,
gastric, colon, lung, liver, renal, breast, ovarian, prostate, squamous cell carcinoma, 
melanoma and brain cancers. Thus, expression of this gene could be used as a marker to 
detect the presence of these cancers. Furthermore, therapeutic modulation of the expression 
or function of this gene may be effective in the treatment of pancreatic, gastric, colon, lung, 
liver, renal, breast, ovarian, prostate, squamous cell carcinoma, melanoma and brain 
cancers. 

Among tissues with metabolic or endocrine function, this gene is expressed at 
moderate levels in pancreas, adipose, adrenal gland, thyroid, pituitary gland, skeletal 
muscle, heart, liver and the gastrointestinal tract. Therefore, therapeutic modulation of the 
activity of this gene may prove useful in the treatment of endocrine/metabolically related 
diseases, such as obesity and diabetes. 

In addition, this gene is expressed at moderate levels in all regions of the central 
nervous system examined, including amygdala, hippocampus, substantia nigra, thalamus, 
cerebellum, cerebral cortex, and spinal cord. Therefore, therapeutic modulation of this gene
product may be useful in the treatment of central nervous system disorders such as 
Alzheimer's disease, Parkinson's disease, epilepsy, multiple sclerosis, schizophrenia and 
depression. 

Panel 4.1D Summary: Ag5630 Highest expression of this gene is detected in LPS
treated monocytes (CT=29.7). Interestingly, this gene is expressed at much higher levels in
LPS activated monocytes than in resting monocytes (CT=40). This observation suggests
that expression of this gene can be used to distinguish activated from resting monocytes. In
addition, upon activation monocytes contribute to innate and specific immunity by
migrating to the site of tissue injury and releasing inflammatory cytokines. This release
contributes to the inflammation process. Therefore, modulation of the expression of the
protein encoded by this gene may prevent the recruitment of monocytes and the initiation
of the inflammatory process.
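
The CT comparison above can be read as an approximate fold difference in template. The following is a minimal sketch, assuming ideal amplification efficiency (one doubling of product per PCR cycle) and that relative expression is scaled to the sample with the lowest CT. The function names are hypothetical and are not taken from the methods of this document.

```python
def fold_difference(ct_low: float, ct_high: float) -> float:
    """Approximate fold difference in template between two samples.

    Assumes 100% amplification efficiency, i.e. one doubling per cycle.
    """
    return 2.0 ** (ct_high - ct_low)


def relative_expression(ct: float, ct_min: float) -> float:
    """Percent expression relative to the sample with the lowest CT."""
    return 100.0 / fold_difference(ct_min, ct)


# CT values quoted in the Panel 4.1D summary for Ag5630.
ct_lps_monocytes = 29.7      # LPS treated monocytes
ct_resting_monocytes = 40.0  # resting monocytes

fold = fold_difference(ct_lps_monocytes, ct_resting_monocytes)
rel_rest = relative_expression(ct_resting_monocytes, ct_lps_monocytes)
print(f"fold difference: {fold:.0f}")
print(f"resting monocytes, % of maximum: {rel_rest:.2f}")
```

Under these assumptions a gap of about ten cycles corresponds to roughly a thousand-fold difference in template, so the resting sample falls well below 1% on a percent-of-maximum scale.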

In addition, this gene is also expressed at moderate to low levels in activated 
polarized T cells, naive and memory T cells, resting and activated LAK cells, resting IL-2 
treated NK cells, two way MLR, activated PBMC cells and B lymphocytes, dendritic cells, 
macrophages, different endothelial cells, bronchial and small airway epithelium, astrocytes,
basophils, keratinocytes, mucoepidermoid cells, lung and dermal fibroblasts, neutrophils
and kidney. Therefore, modulation of the gene product with a functional therapeutic may 
lead to the alteration of functions associated with these cell types and lead to improvement 
of the symptoms of patients suffering from autoimmune and inflammatory diseases such as 
asthma, allergies, inflammatory bowel disease, lupus erythematosus, psoriasis, rheumatoid 
arthritis, and osteoarthritis. 

Panel 5 Islet Summary: Ag5630 Highest expression of this gene is detected in beta
islet cells (CT=26.7). In addition, this gene shows widespread expression in this panel, with
moderate to low expression in adipose, placenta, uterus, skeletal muscle, kidney, and small
intestine samples. Therefore, therapeutic modulation of this gene may be useful in the
treatment of metabolic/endocrine disorders, including obesity and Type I and II diabetes.

AC. CG149350-01 and CG149350-02: Vacuolar ATP synthase 
subunit F. 






Expression of gene CG149350-01 and CG149350-02 was assessed using the 
primer-probe set Ag7581, described in Table ACA. Results of the RTQ-PCR runs are 
shown in Table ACB. Please note that CG149350-02 represents a full-length physical clone 
of the CG149350-01 gene, validating the prediction of the gene sequence. 

Table ACA. Probe Name Ag7581

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-aagaactgccaccccaatt-3' | 19 | 88 | 351
Probe | TET-5'-cattgatggtcgtatccttctccacca-3'-TAMRA | 27 | 113 | 352
Reverse | 5'-aaattgccggaaagtgtctt-3' | 20 | 146 | 353
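
The entries of Table ACA can be checked for internal consistency. The following is a minimal sketch, with the sequences typed in as read from the table after removing OCR spacing. The helper names are hypothetical, and the amplicon calculation assumes that every start position, including the reverse primer's, is the leftmost coordinate of the oligo's footprint on the target sequence. The table does not state that convention.

```python
# Primer-probe set Ag7581 from Table ACA: (sequence, stated length, start position).
AG7581 = {
    "forward": ("aagaactgccaccccaatt", 19, 88),
    "probe": ("cattgatggtcgtatccttctccacca", 27, 113),
    "reverse": ("aaattgccggaaagtgtctt", 20, 146),
}


def check_lengths(primer_set):
    """Return names of oligos whose sequence length differs from the stated length."""
    return [
        name
        for name, (seq, length, _start) in primer_set.items()
        if len(seq) != length
    ]


def amplicon_span(primer_set):
    """Return (first, last, size) of the amplified region.

    Assumption: each start position is the leftmost target coordinate
    covered by the oligo, so the reverse primer ends at start + length - 1.
    """
    first = primer_set["forward"][2]
    rev_seq, _length, rev_start = primer_set["reverse"]
    last = rev_start + len(rev_seq) - 1
    return first, last, last - first + 1


print(check_lengths(AG7581))   # empty list when all stated lengths match
print(amplicon_span(AG7581))
```

The same check can be applied to the other primer-probe tables to catch transcription errors in sequences or lengths.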
Table ACB. CNS_neurodegeneration_v1.0

Tissue Name | Rel. Exp.(%) Ag7581, Run 308752174 | Tissue Name | Rel. Exp.(%) Ag7581, Run 308752174
AD 1 Hippo | 19.9 | Control (Path) 3 Temporal Ctx | 7.3
AD 2 Hippo | 21.3 | Control (Path) 4 Temporal Ctx | 62.9
AD 3 Hippo | 14.9 | AD 1 Occipital Ctx | 19.1
AD 4 Hippo | 6.4 | AD 2 Occipital Ctx (Missing) | 0.0
AD 5 hippo | 65.5 | AD 3 Occipital Ctx | 22.4
AD 6 Hippo | 44.4 | AD 4 Occipital Ctx | 32.3
Control 2 Hippo | 21.9 | AD 5 Occipital Ctx | 4.4
Control 4 Hippo | 30.6 | AD 6 Occipital Ctx | 20.2
Control (Path) 3 Hippo | 10.7 | Control 1 Occipital Ctx | 3.0
AD 1 Temporal Ctx | 23.0 | Control 2 Occipital Ctx | 35.6
AD 2 Temporal Ctx | 27.5 | Control 3 Occipital Ctx | 53.2
AD 3 Temporal Ctx | 19.8 | Control 4 Occipital Ctx | 6.8
AD 4 Temporal Ctx | 21.3 | Control (Path) 1 Occipital Ctx | 70.7
AD 5 Inf Temporal Ctx | ^6.3 | Control (Path) 2 Occipital Ctx | 17.9
AD 5 Sup Temporal Ctx | 55.9 | Control (Path) 3 Occipital Ctx | 4.2
AD 6 Inf Temporal Ctx | 52.9 | Control (Path) 4 Occipital Ctx | 32.5
AD 6 Sup Temporal Ctx | 47.3 | Control 1 Parietal Ctx | 8.7
Control 1 Temporal Ctx | 23.5 | Control 2 Parietal Ctx | 56.3
Control 2 Temporal Ctx | 28.9 | Control 3 Parietal Ctx | 32.5
Control 3 Temporal Ctx | 22.2 | Control (Path) 1 Parietal Ctx | 100.0
Control 4 Temporal Ctx | 9.1 | Control (Path) 2 Parietal Ctx | 38.4
Control (Path) 1 Temporal Ctx | 45.7 | Control (Path) 3 Parietal Ctx | 17.6
Control (Path) 2 Temporal Ctx | 62.0 | Control (Path) 4 Parietal Ctx | 54.2


CNS_neurodegeneration_v1.0 Summary: Ag7581 No differential expression of
this gene was detected between Alzheimer's diseased postmortem brains and those of
non-demented controls in this experiment. However, this panel confirms the expression of
this gene at low levels in the brains of an independent group of individuals. Therefore,
therapeutic modulation of this gene product may be useful in the treatment of central
nervous system disorders such as Parkinson's disease, epilepsy, multiple sclerosis,
schizophrenia and depression.
AD. CG149536-01: PROTEIN-TYROSINE PHOSPHATASE,
NON-RECEPTOR TYPE 2.

Expression of gene CG149536-01 was assessed using the primer-probe sets Ag5255
and Ag6844, described in Tables ADA and ADB. Results of the RTQ-PCR runs are shown
in Tables ADC, ADD and ADE.

Table ADA. Probe Name Ag5255

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-cttatggtttggcagcagaa-3' | 20 | 355 | 354
Probe | TET-5'-ccaaagcagttgtcatgctgaaccgc-3'-TAMRA | 26 | 377 | 355
Reverse | 5'-tggtttcaccactcgattct-3' | 20 | 414 | 356

Table ADB. Probe Name Ag6844

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-agagaatcgagtggtgaaacc-3' | 21 | 412 | 357
Probe | TET-5'-actacctggccagattttggagtccc-3'-TAMRA | 26 | 457 | 358
Reverse | 5'-aggagccagattctctcacttta-3' | 23 | 516 | 359
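
The two primer-probe sets for CG149536-01 target adjacent regions, and the listed coordinates can be cross-checked against the sequences. The following is a minimal sketch, assuming that the start positions in Tables ADA and ADB share one coordinate system on the sense strand and that a reverse primer's start position marks the leftmost base of its footprint. Both points are assumptions and are not stated in the tables. The helper name is hypothetical.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences and start positions as listed in Tables ADA and ADB.
ag5255_reverse, ag5255_reverse_start = "tggtttcaccactcgattct", 414
ag6844_forward, ag6844_forward_start = "agagaatcgagtggtgaaacc", 412

# Sense-strand footprint of the Ag5255 reverse primer.
footprint = reverse_complement(ag5255_reverse)

# Part of the Ag6844 forward primer that lies at or after position 414.
offset = ag5255_reverse_start - ag6844_forward_start
shared = ag6844_forward[offset:]

print(footprint)
print(footprint.startswith(shared))  # True if the two tables are mutually consistent
```

If the final check holds, the Ag5255 reverse primer and the Ag6844 forward primer cover overlapping bases of the same target, which is consistent with the listed start positions of 414 and 412.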



Table ADC. CNS_neurodegeneration_v1.0

Tissue Name | Rel. Exp.(%) Ag5255, Run 229929883 | Tissue Name | Rel. Exp.(%) Ag5255, Run 229929883
AD 1 Hippo | 28.9 | Control (Path) 3 Temporal Ctx | 21.0
AD 2 Hippo | 42.3 | Control (Path) 4 Temporal Ctx | 38.7
AD 3 Hippo | 42.0 | AD 1 Occipital Ctx | 45.4
AD 4 Hippo | 5.9 | AD 2 Occipital Ctx (Missing) | 0.0
AD 5 hippo | 92.7 | AD 3 Occipital Ctx | 36.9
AD 6 Hippo | 29.7 | AD 4 Occipital Ctx | 23.5
Control 2 Hippo | 52.5 | AD 5 Occipital Ctx | 13.6
Control 4 Hippo | 22.4 | AD 6 Occipital Ctx | 47.6
Control (Path) 3 Hippo | 17.9 | Control 1 Occipital Ctx | 3.2
AD 1 Temporal Ctx | 39.5 | Control 2 Occipital Ctx |
AD 2 Temporal Ctx | 56.3 | Control 3 Occipital Ctx | 31.2
AD 3 Temporal Ctx | 23.3 | Control 4 Occipital Ctx | 5.0
AD 4 Temporal Ctx | 10.9 | Control (Path) 1 Occipital Ctx | 99.3
AD 5 Inf Temporal Ctx | 44.8 | Control (Path) 2 Occipital Ctx | 40.3
AD 5 Sup Temporal Ctx | 53.2 | Control (Path) 3 Occipital Ctx | 0.0
AD 6 Inf Temporal Ctx | 68.8 | Control (Path) 4 Occipital Ctx | 24.0
AD 6 Sup Temporal Ctx | 100.0 | Control 1 Parietal Ctx | 20.6
Control 1 Temporal Ctx | 13.4 | Control 2 Parietal Ctx | 68.3
Control 2 Temporal Ctx | 34.4 | Control 3 Parietal Ctx | 29.5
Control 3 Temporal Ctx | 84.1 | Control (Path) 1 Parietal Ctx | 46.3
Control 4 Temporal Ctx | 18.4 | Control (Path) 2 Parietal Ctx | 31.2
Control (Path) 1 Temporal Ctx | 41.2 | Control (Path) 3 Parietal Ctx | 6.9
Control (Path) 2 Temporal Ctx | 58.6 | Control (Path) 4 Parietal Ctx | 45.1



Table ADD. General_screening_panel_v1.5

Tissue Name | Rel. Exp.(%) Ag5255, Run 230218532 | Tissue Name | Rel. Exp.(%) Ag5255, Run 230218532
Adipose | 6.4 | Renal ca. TK-10 | 18.8
Melanoma* Hs688(A).T | 9.5 | Bladder | 10.8
Melanoma* Hs688(B).T | 8.7 | Gastric ca. (liver met.) NCI-N87 | 50.3
Melanoma* M14 | 19.1 | Gastric ca. KATO III | 60.3
Melanoma* LOXIMVI | 25.5 | Colon ca. SW-948 | 5.8
Melanoma* SK-MEL-5 | 18.8 | Colon ca. SW480 | 100.0
Squamous cell carcinoma SCC-4 | 24.0 | Colon ca.* (SW480 met) SW620 | 23.3
Testis Pool | 2.2 | Colon ca. HT29 | 19.2
Prostate ca.* (bone met) PC-3 | 33.9 | Colon ca. HCT-116 | 46.7
Prostate Pool | 4.1 | Colon ca. CaCo-2 | 49.3
Placenta | 1.9 | Colon cancer tissue | 5.7
Uterus Pool | 2.3 | Colon ca. SW1116 | 3.5
Ovarian ca. OVCAR-3 | 19.6 | Colon ca. Colo-205 | 3.3
Ovarian ca. SK-OV-3 | 55.5 | Colon ca. SW-48 | 0.5
Ovarian ca. OVCAR-4 | 8.5 | Colon Pool | 5.9
Ovarian ca. OVCAR-5 | 44.4 | Small Intestine Pool | 5.7
Ovarian ca. IGROV-1 | 5.7 | Stomach Pool | 3.2
Ovarian ca. OVCAR-8 | 7.8 | Bone Marrow Pool | 2.8
Ovary | 8.0 | Fetal Heart | 3.7
Breast ca. MCF-7 | 38.2 | Heart Pool | 0.7
Breast ca. MDA-MB-231 | 13.4 | Lymph Node Pool | 4.1
Breast ca. BT 549 | SI 8 | Fetal Skeletal Muscle |
Breast ca. T47D | 5.4 | Skeletal Muscle Pool | 2.6
Breast ca. MDA-N | 7.0 | Spleen Pool | 0.4
Breast Pool | 9.0 | Thymus Pool | 19.2
Trachea | 1.0 | CNS cancer (glio/astro) U87-MG | 26.4
Lung | 5.7 | CNS cancer (glio/astro) U-118-MG | 33.2
Fetal Lung | 17.1 | CNS cancer (neuro;met) SK-N-AS | 18.9
Lung ca. NCI-N417 | 1.0 | CNS cancer (astro) SF-539 | 17.1
Lung ca. LX-1 | 12.6 | CNS cancer (astro) SNB-75 | 12.2
Lung ca. NCI-H146 | 16.6 | CNS cancer (glio) SNB-19 | 6.4
Lung ca. SHP-77 | 34.6 | CNS cancer (glio) SF-295 | 16.0
Lung ca. A549 | 15.1 | Brain (Amygdala) Pool | 4.0
Lung ca. NCI-H526 | 6.7 | Brain (cerebellum) | 33.2
Lung ca. NCI-H23 | 33.0 | Brain (fetal) | 54.0
Lung ca. NCI-H460 | 7.2 | Brain (Hippocampus) Pool | 4.7
Lung ca. HOP-62 | 26.2 | Cerebral Cortex Pool | 5.3
Lung ca. NCI-H522 | 35.1 | Brain (Substantia nigra) Pool | 4.0
Liver | 0.9 | Brain (Thalamus) Pool | 6.8
Fetal Liver | 7.2 | Brain (whole) | 4.9
Liver ca. HepG2 | 9.7 | Spinal Cord Pool | 70
Kidney Pool | 7.3 | Adrenal Gland | 2.4
Fetal Kidney | 16.3 | Pituitary gland Pool | 2.1
Renal ca. 786-0 | 7.1 | Salivary Gland | 1.5
Renal ca. A498 | 2.2 | Thyroid (female) | 1.1
Renal ca. ACHN | ?.2 | Pancreatic ca. CAPAN2 | 56.4
Renal ca. UO-31 | {6.5 | Pancreas Pool | 1.2



Table ADE. Panel 4.1D

Tissue Name | Rel. Exp.(%) Ag5255, Run 229851730 | Rel. Exp.(%) Ag6844, Run 279029113 | Tissue Name | Rel. Exp.(%) Ag5255, Run 229851730 | Rel. Exp.(%) Ag6844, Run 279029113
Secondary Th1 act | 39.0 | 38.7 | HUVEC IL-1beta | 39.8 | 9.6
Secondary Th2 act | 46.7 | 55.9 | HUVEC IFN gamma | 12.5 | 15.9
Secondary Tr1 act | 15.7 | 18.9 | HUVEC TNF alpha + IFN gamma | 21.0 | 8.4
Secondary Th1 rest | 12.0 | 3.9 | HUVEC TNF alpha + IL4 | 12.1 | 11.0
Secondary Th2 rest | 0.0 | 5.3 | HUVEC IL-11 | 13.6 | 4.4
Secondary Tr1 rest | 0.0 | 9.2 | Lung Microvascular EC none | 25.2 | 18.4
Primary Th1 act | 17.9 | 6.0 | Lung Microvascular EC TNFalpha + IL-1beta | 2.6 | 9.4
Primary Th2 act | 15.0 | 33.7 | Microvascular Dermal EC none | 6.0 | 3.8
Primary Tr1 act | 18.2 | 22.7 | Microvascular Dermal EC TNFalpha + IL-1beta | 0.0 | 3.7
Primary Th1 rest | 0.0 | 1.9 | Bronchial epithelium TNFalpha + IL1beta | 9.3 | 10.2
Primary Th2 rest | 5.0 | 1.5 | Small airway epithelium none | 0.0 | 10.0
Primary Tr1 rest | 0.0 | 0.0 | Small airway epithelium TNFalpha + IL-1beta | 37.1 | 14.1
CD45RA CD4 lymphocyte act | 32.1 | 13.9 | Coronary artery SMC rest | 11.1 | 5.5
CD45RO CD4 lymphocyte act | 58.6 | 42.9 | Coronary artery SMC TNFalpha + IL-1beta | 11.3 | 4.0
CD8 lymphocyte act | 5.2 | 18.7 | Astrocytes rest | 0.0 | 1.1
Secondary CD8 lymphocyte rest | 10.9 | 5.5 | Astrocytes TNFalpha + IL-1beta | 0.0 | 1.8
Secondary CD8 lymphocyte act | 0.0 | 4.4 | KU-812 (Basophil) rest | 38.4 | 17.2
CD4 lymphocyte none | 6.7 | 3.4 | KU-812 (Basophil) PMA/ionomycin | 33.2 | 38.7
2ry Th1/Th2/Tr1_anti-CD95 CH11 | 0.0 | 26.4 | CCD1106 (Keratinocytes) none | 76.3 | 40.1
LAK cells rest | 19.1 | 14.7 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 13.1 | 14.9
LAK cells IL-2 | 5.4 | 7.3 | Liver cirrhosis | 15.8 | 7.0
LAK cells IL-2+IL-12 | 7.9 | 1.0 | NCI-H292 none | 35.1 | 20.2
LAK cells IL-2+IFN gamma | 16.2 | 77 | NCI-H292 IL-4 | 45.4 | 25.5
LAK cells IL-2+IL-18 | 5.1 | 8.0 | NCI-H292 IL-9 | 60.7 | 31.2
LAK cells PMA/ionomycin | 27.9 | 40.9 | NCI-H292 IL-13 | 45.4 | 38.4
NK Cells IL-2 rest | 27.9 | 40.3 | NCI-H292 IFN gamma | 26.2 | 16.7
Two Way MLR 3 day | 18.2 | 27.0 | HPAEC none | 5.6 | 5.3
Two Way MLR 5 day | 23.3 | 2.1 | HPAEC TNF alpha + IL-1 beta | 21.5 | 12.1
Two Way MLR 7 day | 4.5 | 1.7 | Lung fibroblast none | 22.5 | \22
PBMC rest | 3.2 | 5.4 | Lung fibroblast TNF alpha + IL-1 beta | 6.3 | 12
PBMC PWM | 20.6 | 9.8 | Lung fibroblast IL-4 | 16.0 | 13i
PBMC PHA-L | 21.6 | 12.1 | Lung fibroblast IL-9 | |
Ramos (B cell) none | 40.3 | 4.8 | Lung fibroblast IL-13 | 0.0 | 5.8
Ramos (B cell) ionomycin | 31.6 | 17.7 | Lung fibroblast IFN gamma | 37.6 | 19.9
B lymphocytes PWM | 26.6 | 6.0 | Dermal fibroblast CCD1070 rest | 32.3 | 17.2
B lymphocytes CD40L and IL-4 | 4.8 | 37.6 | Dermal fibroblast CCD1070 TNF alpha | 100.0 | 54.7
EOL-1 dbcAMP | 62.9 | 74.2 | Dermal fibroblast CCD1070 IL-1 beta | 34.6 | 18.7
EOL-1 dbcAMP PMA/ionomycin | 45.4 | 15.1 | Dermal fibroblast IFN gamma | 17.1 | 12.7
Dendritic cells none | 7 | | Dermal fibroblast IL-4 | j.i | 15.U
Dendritic cells LPS | 21.0 | 15.2 | Dermal Fibroblasts rest | 0.0 | 6.9
Dendritic cells anti-CD40 | 10.2 | 7.3 | Neutrophils TNFa+LPS | 0.0 | 2.7
Monocytes rest | 4.3 | 32.1 | Neutrophils rest | 5.6 | 6.1
Monocytes LPS | 69.7 | 100.0 | Colon | 0.0 | 0.9
Macrophages rest | 17.0 | 3.8 | Lung | 0.0 | 1.7
Macrophages LPS | 0.0 | 9.3 | Thymus | 15.2 | 18.2
HUVEC none | 5.9 | 28.7 | Kidney | 6.3 | 8.7
HUVEC starved | 28.1 | J8.5 | | |









AI_comprehensive panel_v1.0 Summary: Ag5255 Expression of this gene is
low/undetectable (CTs > 35) across all of the samples on this panel. 

CNS_neurodegeneration_v1.0 Summary: Ag5255 This panel confirms the
expression of this gene at low levels in the brains of an independent group of individuals. 
However, no differential expression of this gene was detected between Alzheimer's 
diseased postmortem brains and those of non-demented controls in this experiment. Please 
see Panel 1.5 for a discussion of this gene in treatment of central nervous system disorders. 

General_screening_panel_v1.5 Summary: Ag5255 Highest expression of this
gene is detected in a colon cancer SW480 cell line (CT=31.6). Moderate to low levels of
expression of this gene are also seen in a cluster of cancer cell lines derived from pancreatic,
gastric, colon, lung, liver, renal, breast, ovarian, prostate, squamous cell carcinoma,
melanoma and brain cancers. Thus, expression of this gene could be used as a marker to
detect the presence of these cancers. Furthermore, therapeutic modulation of the expression
or function of this gene may be effective in the treatment of pancreatic, gastric, colon, lung,
liver, renal, breast, ovarian, prostate, squamous cell carcinoma, melanoma and brain
cancers.

In addition, this gene is expressed at moderate levels in cerebellum and fetal brain. 
Therefore, therapeutic modulation of this gene product may be useful in the treatment of 
central nervous system disorders such as ataxia and autism.

Panel 4.1D Summary: Ag5255/Ag6844 Two experiments with different probe
and primer sets are in good agreement. The highest expression of this gene is detected in
TNF alpha activated dermal fibroblasts and LPS activated monocytes (CTs=32.7-32.9).
Moderate to low levels of expression of this gene are detected in activated polarized T cells,
naive and memory T cells, PMA/ionomycin activated LAK cells, resting IL-2 treated NK
cells, eosinophils, resting dendritic cells, activated basophils, resting keratinocytes, and
activated mucoepidermoid NCI-H292 cells. Therefore, therapeutic modulation of this gene
or its protein product may be useful in the treatment of autoimmune and inflammatory
diseases such as asthma, allergies, inflammatory bowel disease, lupus erythematosus,
psoriasis, rheumatoid arthritis, and osteoarthritis.
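
Agreement between the two probe sets can be put into a single number. The following is a minimal sketch that computes a Pearson correlation over a handful of samples taken from Table ADE. The choice of samples is arbitrary and illustrative, the `pearson` helper is hypothetical, and the calculation is not the analysis used in this document.

```python
from math import sqrt

# (sample, Ag5255 run 229851730, Ag6844 run 279029113); Rel. Exp.(%) from Table ADE.
PAIRED = [
    ("Secondary Th1 act", 39.0, 38.7),
    ("Secondary Th2 act", 46.7, 55.9),
    ("Secondary Tr1 act", 15.7, 18.9),
    ("Secondary Th1 rest", 12.0, 3.9),
    ("Dermal fibroblast CCD1070 TNF alpha", 100.0, 54.7),
    ("Monocytes LPS", 69.7, 100.0),
    ("Thymus", 15.2, 18.2),
    ("Kidney", 6.3, 8.7),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


ag5255 = [row[1] for row in PAIRED]
ag6844 = [row[2] for row in PAIRED]
r = pearson(ag5255, ag6844)
print(f"Pearson r over {len(PAIRED)} samples: {r:.2f}")
```

A coefficient close to 1 would support the statement that the two runs agree. With only eight hand-picked samples the value is indicative at best, and a fuller comparison would use every sample on the panel.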

AE. CG149964-01: Brain mitochondrial carrier protein-1.

Expression of gene CG149964-01 was assessed using the primer-probe set Ag7056, 
described in Table AEA. 

Table AEA. Probe Name Ag7056

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-tgtggttccaactgctcag-3' | 19 | 617 | 360
Probe | TET-5'-ctggtagctctactcctacaacgatggcag-3'-TAMRA | 30 | 640 | 361
Reverse | 5'-agatccacatgtcccatcatt-3' | 21 | 707 | 362



General_screening_panel_v1.6 Summary: Ag7056 Expression of this gene is
low/undetectable in all samples on this panel (CTs>35). 

AF. CG150799-01, CG150799-02 and CG150799-03: MASS1. 

Expression of gene CG150799-01, CG150799-02 and CG150799-03 was assessed 
using the primer-probe sets Ag5242, Ag5243, Ag5244, Ag5245, Ag5247 and Ag5248, 
described in Tables AFA, AFB and AFC. Results of the RTQ-PCR runs are shown in
Tables AFD, AFE, AFF, AFG, AFH and AFI. Please note that probe-primer set Ag5243 is
specific for CG150799-02 and probe-primer sets Ag5244 and Ag5245 are specific for 
CG150799-03. 

Table AFA. Probe Name Ag5242

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-acgaatcccatgtgacacttt-3' | 21 | 3624 | 363
Probe | TET-5'-cccttcattataaaaccttgggttcca-3'-TAMRA | 27 | 3645 | 364
Reverse | 5'-tgactgttgtcttggcaatgt-3' | 21 | 3681 | 365



Table AFB. Probe Name Ag5243

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-gactccttccaaaggctatattgt-3' | 24 | 8809 | 366
Probe | TET-5'-cgattcaaggccctacaaatatctgcca-3'-TAMRA | 28 | 8849 | 367
Reverse | 5'-ccatttctggttccgtgtcta-3' | 21 | 8880 | 368



Table AFC. Probe Name Ag5244

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-actgataattctattcctgaactgga-3' | 26 | 4927 | 369
Probe | TET-5'-agctctgctagatctatctacagatataacgctgtaaaatc-3'-TAMRA | 41 | 4992 | 370
Reverse | 5'-aactcattatagatcatccaaaagga-3' | 26 | 5036 | 371



Table AFD. Probe Name Ag5245

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-accttgttgatgactttgctaatg-3' | 24 | 4320 | 372
Probe | TET-5'-cagtggaactattacattccttccttggcaga-3'-TAMRA | 32 | 4345 | 373
Reverse | 5'-ggaagcgacacttcaatcaaa-3' | 21 | 4387 | 374


Table AFE. Probe Name Ag5247

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-acttacgttggacttaccatgg-3' | 22 | 8183 | 375
Probe | TET-5'-caagttcatttcctcccagactaggtatgagg-3'-TAMRA | 32 | 8211 | 376
Reverse | 5'-tcatttcatttgaagtgagcaa-3' | 22 | 8263 | 377



Table AFF. Probe Name Ag5248

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-accttgttgatgactttgctaatg-3' | 24 | 4320 | 378
Probe | TET-5'-cagtggaactattacattccttccttggcaga-3'-TAMRA | 32 | 4345 | 379
Reverse | 5'-caagaacatatatattcagaacctctgatc-3' | 30 | 4377 | 380



Table AFG. AI_comprehensive panel_v1.0



Tissue Name 


Rel. 

Exp.(%) 
Ag5242, 
Run 

305464510 


Tissue Name


Rel. 

Exp.(%) 
Ag5242, 
Run 

305464510 


110967 COPD-F 


0.1 


112427 Match Control Psoriasis-F 


2.3 


110980 COPD-F 


1.1 


112418 Psoriasis-M 


0.1 


110968 COPD-M 


0.1 


112723 Match Control Psoriasis-M 


0.5 


110977 COPD-M 


4.4 


112419 Psoriasis-M 


0.0 


110989 Emphysema-F


0.2 


112424 Match Control Psoriasis-M 


0.2 


110992 Emphysema-F


2.7 


112420 Psoriasis-M 


1.8 


110993 Emphysema-F 


0.1 


112425 Match Control Psoriasis-M


3.7 


110994 Emphysema-F


0.1 


104689 (MF) OA Bone-Backus 


0.2 


110995 Emphysema-F 


6.8 


104690 (MF) Adj "Normal" 
Bone-Backus 


0.6 
110996 Emphysema-F


2.0 


104691 (MF) OA Synovium-Backus


0.1


110997 Asthma-M


0.1 


104692 (BA) OA Cartilage-Backus 


0.0 


111001 Asthma-F 


0.5 


104694 (BA) OA Bone-Backus 


0.2 


111002 Asthma-F 


0.9 


104695 (BA) Adj "Normal" 
Bone-Backus 


0.4 


111003 Atopic Asthma-F


1.5 


104696 (BA) OA Synovium-Backus 


0.1 


111004 Atopic Asthma-F 


6.1 


104700 (SS) OA Bone-Backus 


0.9 


111005 Atopic Asthma-F 


2.5 


104701 (SS) Adj "Normal" 
Bone-Backus


0-6 1 


111006 Atopic Asthma-F


0 0 


104702 (SS) OA Synovium-Backus


0.2 


111417 Allergy-M


l/.O 


i i two ua cartilage Kepv 


0.9 


112347 Allergy-M


Vf.V 


i izo/z ua uoneD 


0.0 


1 12349 Nftrm^l T nna-"R 




iizo/J UAoynoviurrD 


0.1 i 


112357 Normal Lung-F


1 0 
l.U 


lizo/^f ua oynovial jpiuid cells!) 


0.2 1 


112354 Normal Lung-M 


0.7 


117100 OA Cartilage Repl4 


o.o j 


112374 Crohns-F 


0.5 


112756 OA Bone9 


100.0 j 


112389 Match Control Crohns-F 


0.2 


112757 OA Synovium9 


6.4 j 


112375 Crohns-F 


0.1 


112758 OA Synovial Fluid Cells9 


0.1 | 


112732 Match Control Crohns-F


0.3 


117125 RA Cartilage Rep2 


0.0 I 


112725 Crohns-M 


0.1 


113492 Bone2 RA 


31.6 j 


112387 Match Control 
Crohns-M 


n 1 

U.l 


iiJ4yj oynoviumz KA 


11.8 1 


112378 Crohns-M 


0.0 


113494 Syn Fluid CeUsRA 


22.2 


112390 Match Control 
Crohns-M 


i *; 
i.j 


lio^+yy uartiiage4 KA 


22.7 


112726 Crohns-M 


1.2 


1 13500 Bone4RA 


28.1 | 


112731 Match Control 
Crohns-M 




11JDU1 oynovium4KA 


20.2 


112380 Ulcer Col-F 


1.0 


113502 Syn Fluid Cells4RA 


16.4 


112734 Match Control Ulcer 
Col-F 


0 R 
u.o 


iiooyj cartilage.? ka 


22.7 


112384 Ulcer Col-F 


3.7 


113496 Bone3 RA 


24.5 "j 


112737 Match Control Ulcer 
Col-F 




no^y i synoviums ka 


14.7 f 


112386 Ulcer Col-F 


0.2 


1 13498 Syn Fluid Cells3 RA 


J3.0 | 


1 12738 Match Control Ulcer 
Col-F 




ii/ luo iNormai v^aruiage Kepzu i 


J.O 1 


112381 Ulcer Col-M 


10 


113663 Bone3 Normal ( 


).0 J 


1 12735 Match Control Ulcer 
Col-M 


).o 


[ 13664 Synovium3 Normal ( 


).0 


112382 Ulcer Col-M 


).3 ] 


i 13665 Syn Fluid Cells3 Normal ( 


).0 | 


1 12394 Match Control Ulcer . 
Col-M 1 


).l 1 


1 17107 Normal Cartilage Rep22 C 


).l 


112383 Ulcer Col-M <■ 


1.5 ] 


13667 Bone4 Normal C 




112736 Match Control Ulcer , 
Col-M 1 


).3 1 


1 3668 Synovium4 Normal ( 


U 
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|1 12423 Psoriasis-F {0.2 |l 13669 Syn pfedhCteh^ MmaT 11 ^ joif 



hat - as^ JI B 



Table AFH. CNS_neurodegeneration_v1.0



Tiss 
ue 
Na 
me 


Rel. 
Exp.< 

of \ 

'Ag52 
42, 
Run 
2296 
6154 
6 


Rel. 

[Exp. 

(%) 

Ag5 
242, 
Run 

0987 
6 


Rel. 
Exp. 

/tit \ 
(%) 

Ag52 
43, 
Run 
2296 

6154 
7 


Rel. 
Exp. 

(%) 

' Ag5 
243, 
Run 
2768 

6356 
6 


Rel. 

Exp. 

(%) 

Ag52 

43, 

Run 

ml / / 

3146 
0 


Rel. 
Exp.( 

%) 
' Ag524 

4, 

Run 

22966 

1548 


Rel. 
Exp. 

, AgS 
244, 
Run 

1076 
2 


Rel. 
Exp. 
(%) 
!Ag52 
44, 
Run 
nil 
3146 
1 


jRel. 
jExp. 
(%) 
*• Ag5 
245, 
Run 

6154 
9 


Rel. 

Exp. 

(%) 

Ag52 

45, 

Run 

1032 
0 


Rel. 
Exp. 
(%) 
!Ag5 
247, 
Run 
2296 
6155 
0 


Rel. 

Exp. 

(%) 

Ag52 

47, 

Run 

2768 

6357 
0__ 


Rel. 
Exp. 
(%) 
• AgS 
248, 
Run 
2296 
6155 
1 


Rel. 
Exp. 

(%) 

Ag52 

48, 

Run 

2768 

6357 

2 


Rel. 
Exp.( 

%) 

Ag52 

48, 

Run 

2777 

3146 

6 


AD 
1 

Hip 
po 


22.4 


21.6 


29.3 


31.6 


27.5 


9.1 


0.0 


3.1 


16.0 


0.0 


9.0 


6.7 


14.9 


13.7 


17.9 


AD 

Hip 
po 


47.3 


42.0 


54.7 


53.2 


44.8 


0.0 


2.9 


4.0 


16.2 


4.6 


41.8 


21.8 


44.4 


32.8 


32.5 


AD 
3 

Hip 
po 


12.2 


13.5 


17.8 


13.6 


10.9 


0.0 


0.0 


0.0 


0.0 


0.0 


5.8 


0.0 


9.8 


4.8 


6.8 


AD 
4 

Hip 

PO 

AD 

5 

Hip 
po 


14.8 


14.4 


16.6 


17.7 


20.6 


0.0 


0.0 


0.0 


23.2 


7.6 


17.3 


8.6 


12.8 


6.4 


7.0 


65.5 


84.1 


61.6 


63.7 


57.4 


6.7 


0.0 


4.3 


11.6 


5.3 


84.7 


31.0 


85.3 


61.1 


62.0 


AD 
6 

Hip 
po 


56.3 


59.5 


82.4 


84.7 


90.1 


74.2 


57.8 


51.8 


58.6 


30.8 


100. 
0 


92.0 


69.3 


48.3 


55.5 


Con 
trol 
2 

Hip 
po 


29.5 


25.7 


29.3 


31.6 


31.9 


5.0 


3.0 


5.5 


15.1 


29.1 


»2.0 


29.9 : 


27.7 


25.0 


26.1 


Con 
trol 

4 : 

Hip 
po 


52.8 : 


29.7 : 


$5.6 : 


$1.2 : 


37.1 i 


$.1 


11.3 


).0 


3.0 ( 


).o : 


17.0 : 


>3.7 5 


»5.2 ; 


10.9 1 


16.5 



Con 
trol 
(Pat 
h)3 
Hip 
po 



AD 
1 

Te 
mp 
oral 
Ctx 



AD 
2 

Te 

mp 

oral 

Ctx 

AD 

3 

Te 
mp 
oral 
Ctx 



33.9 



32.3 



35.8 



AD 

4 

Te 
mp 
oral 
Ctx 



AD 
5 

mf 

Te 

mp 

oral 

Ctx 



28.3 



33.9 



33.9 



42.3 



21.2 



47.3 



73.7 



44.8 



24.7 



32.3 



39.5 



20.4 



36.6 



24.0 



34.6 



30.8 



35.4 



51.1 



23.5 



39.0 



100. 
0 



100.0 



100. 
0 



46.3 



0.0 



2.0 



3.3 



4.5 0.0 



5.4 



20.7 



45.4 



100.0 0.0 



0.0 



10.3 



5.4 



8.2 



2.9 



0.0 



0.0 



0.0 8.3 



0.0 



11.4 



9.3 



14.2 



0.0 



39.0 



17.6 



0.0 



13.0 



12.2 



13.1 



19.5 29.9 



21.9 



0.0 



0.0 



19.2 



24.5 



0.0 



28.7 



32.5 



4.4 



43.5 



43.5 



4.5 



25.3 



26.2 



38.2 



12.2 9.5 



19.8 



17.9 



37.6 



22.1 



26.1 



100.0 



33.0 



74.7 



76.8 



11.3 



25.9 



100.0 79.6 



29.5 



AD 
5 

Sup 

Te 

mp 

oral 

Ctx 



93.3 



77.4 



87.7 



82.4 



88.3 



7.3 



10.7 



8.9 



29.1 



3.3 



45.7 



59.0 



82.4 



70.7 



64.2 



AD 
6 

Jnf 

Te 

mp 

oral 

Ctx 



59.0 



58.2 



62.0 



0.0 



58.2 



55.9 



94.0 



49.0 



55.5 



100.0 87.1 



87.7 



7i.7 



46.3 



65.1 

AD 
6 
Sup 
Te 
mp 
oral 
Ctx 

Con 

trol 

1 

Te 
mp 
oral 
Qx 

Con 

trol 

2 

Te 
mp 
oral 
Ctx 

Con 
trol 
3 

Te 
mp 
oral 
Ctx 

Con 
trol 



85.3 



99.3 



74.2 



74.7 



90.1 



100.0 



100. 



100.0 



99.3 



73.7 



95.9 



47.6 



46.3 



27.4 



28.5 



29.1 



1.7 10.0 



0.0 



58.2 



27.7 



25.3 



37.6 



37.4 



30.6 



27.5 



32.8 



2.7 11.0 



4.5 



31.4 



48.3 



8.9 



97.3 



97.3 



19.1 



44.4 



Mb 



60.3 



a?3 



94.0 



25.2 



32.3 



15.4 50.0 



27.5 



24.1 



27.4 



32.8 



37.6 



7.1 5.4 



2.6 



5.1 



6.3 



34.4 



16.6 



5.8 



Te 
mp 
oral 
Ctx 

Con 

trol 

(Pat 

h)l 

Te 

mp 

oral 

Ctx 

Con 

trol 

(Pat 

h)2 

Te 

mp 

oral 

Ctx i 



38.2 



39.0 



34.6 



30.6 



31.9 



8.7 82.6 



8.4 



0.0 



35.6 



31.4 



21.3 



29.1 



13.7 



66.0 



81.2 



54.0 



58.6 



52.5 



2.5 0.0 



78.5 



37.9 



43.5 



50.0 



40.1 



41.8 



41.5 



2.0 10.9 



0.0 



80.1 



20.9 



72.7 



42.6 



22.4 



72.7 



75.8 



31.9 



26.1 



63.3 



27.7 



31.4 



69.7 



42.9 



33.9 



42.0 

Con 
trol 
(Pa 
h)2 
Te 
mp 
oral 
Ctx 


i 
t 

5 23.3 


24.5 


19.9 


21.5 


22.4 


2.9 


2.3 


7.7 


0.0 


■PC 

4.1 


5.8 


4.3 


ia , 
21.0 


'. ~.s -a 
14.5 


16.2 


Lor 

trol 

(Pat 

h)4 

Te 

mp 

oral 

Ctx 


i 

52.5 


48.0 


33.7 


39.8 


39.0 


0.0 


4.7 


4.3 


49.3 


43.2 


73.7 


49.3 


40.6 


32.5 


47.6 


AD 
1 

Occ 
ipit 
al 

Ctx 


18.0 


18.8 


22.8 


25.7 


24.3 


0.0 


3.0 


0.0 


0.0 


0.0 


10.2 


13.8 


19.9 


12.8 


14.3 


AD 
2 

Occ 
ipit 
al 

Ctx 
(Mi 
ssw 

g) 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


AD 

3 

Occ 
ipit 

Ctx 


15.5 


14.0 


17.8 


17.8 | 


18.0 


0.0 


0.0 


0.0 


3.2 


0.0 


10.2 


10 


10.3 


5.2 


5.5 


i 
t 

< 

i 

£ 

( 


\D 
I 

Dec 
pit 
il 

:u 


17.3 ; 


13.7 : 


15.3 : 


£7.5 E 


»4.3 


5.3 • 


u 


$.3 : 


>8.7 < 


>.7 : 


22.2 : 


>3.0 ? 


5.6 1 


17.0 : 


>1.5 


k 
5 
( 

» 
a 


)cc „ 

pit 
:t*| 


2.4 2 


6.1 2 


.1.3 1 


5.2 |2 


2.5 2 


..0 3 


.1 5 


.3 2 


5.7 5 


.1 1 


6.7 


.7 3 


.3 2 


0.9.1 


8.4 

AD 
6 
19.6 
al 
h)2 
al 

Ctx 


56.6 


64.6 


48.6 


58.6 


57.8 

5.1 


7.8 


66.4 


30.6 


69.7 


57.0 


76.3 


42.6 


55.5 

S.l ! 


10 


7.1 i 


J.5 


2.0 

).o ; 


5.6 


3.0 : 


1.6 ( 


).0 

79.0 



83.5 



17.3 



62.4 



17.1 



68.3 



19.8 



57.8 



22.1 



15.3 



76.3 



0.0 



0.0 



0.0 



0.0 



0.0 



0.0 



0.0 



0.0 



11.7 



15.7 



0.0 



7.9 



17.0 



0.0 



70.2 



63.7 



21.0 



259 



4.2 



0.0 



5.1 



0.0 1.5 



7.2 



0.0 



23.8 



0.0 



9.7 



9.8 



3.8 



1.9 



10.4 



0.0 



0.0 



26.8 



1.8 1.7 



2.1 10.3 



3.6 



12.2 



0.0 0.0 



0.0 



0.0 



100. 
0 



30.8 



55.9 



0.0 



9.8 



39.0 



1.7 



16.4 



37.9 



7.2 



23.8 



100. 
0 



12.9 



41.5 



100.0 99.3 



17.9 



18.0 



6.3 



5.0 



16.3 



44.1 



8.5 



17.1 



633 



11.3 



53.2 71.2 



10.2 



15.8 



Con 

trol 

(Pat 

h)3 

Pari 

eta] 

Ctx 



Con 

trol 

(Pat 

h)4 

Pari 

eta] 

Ctx 



12.0 



30.1 



10.2 



25.5 



11.7 



26.1 



8.4 



25.7 



13.9 



29.1 



2.8 



5.3 0.0 



1.5 



0.0 



0.0 



6.3 



59.0 



0.0 



40.9 



3.9 



23.7 



0.0 



30.1 



3.2 



25.9 



4.9 



26.4 



4.2 



17.8 



Table AFI. General_screening_panel_v1.5 

Tissue Name 
Rel. Exp.(%) Ag5242, Run 229665046 
Rel. Exp.(%) Ag5243, Run 229665047 
Rel. Exp.(%) Ag5245, Run 229665049 
Rel. Exp.(%) Ag5247, Run 229665052 
Rel. Exp.(%) Ag5248, Run 229665053 


Adipose 


0.1 


0.0 


0.0 


0.0 


0.0 


Renal ca. 
TK-10 


0.0 


0.1 


0.0 


0.0 


0.0 


Melanoma 

* 

Hs688(A). 
T 


0.8 


0.5 


0.0 


0.0 


1.2 


Bladder 


2.6 


1.8 


0.0 


2.5 


3.7 


Melanoma 

Hs688(B). 
T 


0.1 


0.0 


0.0 


0.0 


0.0 


Gastric ca. 
(liver 
met) 
NCI-N87 


0.0 


0.0 


0.0 


0.0 


0.0 


Melanoma 
*M14 


0.2 


0.3 


0.0 


0.0 


0.1 


Gastric ca. 
KATOIH 


0.0 


0.0 


0.0 


0.0 


0.1 


Melanoma 

* 

LOXJMV 
I 


0.9 


0.2 


0.0 


0.0 


0.1 


Colon ca. 
SW-948 


5.2 


4.6 


0.4 


0.6 


3.7 


Melanoma 

* 

SK-MEL- 
5 


0.6 


1.6 


0.0 


1.2 


0.0 


Colon ca. 
SW480 


4.6 


3.7 


0.0 


1.1 


5.9 


Squamous 
cell 

carcinoma 
SCCM 


3.0 


D.0 


3.0 


3.0 


3.0 ( 
i 


Colon ca.* 
;SW480 
net) 
5W620 


3.1 1 


3.0 ( 


3.0 


3.0 


3.0 

Testis 
Pool 


2.9 


3.4 


0.0 


3.3 


3.8 


Colon ca. 
HT29 


0.0 


0.0 


0.0 


0.0 


0.0 


Prostate 
ca.* (bone 
met) PC-3 


» 89.5 


86.5 


5.8 


18.2 


100.0 


Colon ca. 
HCT-116 


12.2 


11.9 


0.4 


4.4 


14.2 


Prostate 
Pool 


10.7 


8.5 


0.7 


2.4 


7.0 


Colon ca. 
CaCo-2 


13.8 


14.0 


8.9 


6.6 


16.4 


Placenta 


0.0 


0.1 


0.0 


1.0 


0 1 




Colon 
tissue 




u.u 


n n 
u.u 


U.U 


0.0 


Uterus 
Poo] 


0.0 


0.1 


0.0 


0.0 


0.1 


Colon ca. 

cwi 1 1fV 
D YV i. ± ID 


0.1 


0.0 


0.0 


0.0 


0.0 


Ovarian 
ca. 

OVCAR- 

3. 


10.5 


18.7 


12.1 


1.7 


16.7 


Colon ca. 
Colo-205 


0.0 


0.0 


0.0 


0.0 


1.2 


Ovarian 
ca. 

SK-OV-3 


0.2 


0.1 


0.0 


0.2 


0.0 


Colon ca. 
SW-48 


0.0 


0.0 


0.0 


0.0 


0.0 


Ovarian 
ca. 

OVCAR- 
4 


0.1 


0.0 


0.0 


0.0 


0.1 


Colon 
Pool 


0.1 


0.0 


0.0 


0.6 


0.1 


Ovarian 
ca. 

OVCAR- 
5 


7.3 


7.1 


0.0 


3.7 


12.1 


Small 

Intestine 

Pool 


3.7 


1.6 


1.6 


1.0 


4.1 


Ovarian 
ca. 

IGROV-l 


1.4 


3.5 


v.V/ 


on 




Stomach 
Pool 


i.o 


U.7 


0.0 


0.4 


0.9 


Ovarian 
ca. 

OVCAR- 
8 


8.5 


13.0 


0.9 


0.5 


10.7 


Bone 

Marrow 

Pool 


0.1 


0.0 


0.0 


0.0 


0.1 


Ovary 


0.1 


0.4 


0.0 


0.0 


1.0 


Fetal 
Heart 


o.o 


D.O 


0.0 


0.3 


O.O 


Breast ca. 
MCF-7 


11.1 


10.2 


5.0 


3.6 


16.4 ] 


Heart Pool 


3.1 ( 


3.0 ( 


).0 1 


17 


3.1 


Breast ca. 
MDA-MB : 
-231 


5.7 ' 


1.8 ; 


1.2 


3 6 


59 ] 


^ympfa ( 
^ode Pool 


1 ^ f 
J.j I 


~\ c\ i 
J.U l 




).6 ( 


).l 


oreasi ca. ( 
BT549 1 


).0 ( 
.0.2 A 


).0 ( 


).0 ( 


3.0 ( 
1 


l 

).0 J 

,1 


7 etal 

Skeletal ( 
Muscle 


).2 C 


).0 ( 


).0 ] 


l.l ( 


).0 


Breast ca. 
T47D 1 


L4 ( 


).0 ; 


5.1 S 


5 

).9 J 
1 


Jkeletal 
/Tuscle . C 
>ool 


1.0 C 


u c 


1.0 c 


>.8 C 


1.1 


Breast ca. r 
MDA-N 1 


1.1 C 


1.2 C 


i.o [c 


).0 C 


,5 | 


Jpleen 1 
'ool 1 


.5 0 


u o 


.5 2 


.3 0 


.6 

Breast 
Pool 


0.8 


1.9 


0.0 


0.9 


1.5 


Thymus 
Pool 


3.2 


1.7 


ytsE ig-.. 
1.9 


0.7 


2.9 


Trachea 


7.4 


6.4 


0.9 


20.3 


9.5 


CNS 
cancer 
(glio/astro) 
U87-MG 


4.4 


2.6 


0.3 


1.2 


3.2 


Lung 


0.3 


0.0 


0.0 


0.0 


0.1 


CNS 

cancer 

(glio/astro] 

U-118-M 

G 


0.1 


0.0 


0.0 


0.0 


0.0 


Fetal 
Lung 


25.7 


20.9 


1.7 


6.7 


22.2 


CNS 
cancer 
(neuro;met 
) 

SK-N-AS 


0.0 


0.0 


0.0 


0.0 


0.0 


Lung ca. 
NCI-N417 


3.4 


3.6 


0.0 


0.7 


11.6 


CNS 
cancer 
(astro) 
SF-539 


0.2 


0.0 


0.0 


0.0 


0.1 


Lung ca. 
LX-1 


0.1 


0.0 


0.0 


0.0 


0.0 


CNS 
cancer 
(astro) 
SNB-75 


0.1 


0.1 


0.0 


0.0 


0.2 


Lung ca. 
NCI-H146 


26.1 


28.9 


27.9 


7.7 


24.7 


CNS 
cancer 
(glio) 
SNB-19 


2.0 


4.1 


0.0 


0.6 


3.4 


Lung ca. 
SHP-77 


100.0 


100.0 


100.0 


42.9 


98.6 


CNS 
cancer 
(glio) 
SF-295 


2.4 


3.3 


0.4 


0.3 


4 1 


Lung ca. 
A549 


0.9 


1.3 


0.0 


0.0 


1.1 


Brain 
(Amygdal 
a) Pool 1 


13.4 


29.1 


1.8 


4.2 


14.6 


Lung ca. 
NCI-H526 


1.8 


1.1 


0.0 


0.0 


1.9 


Brain 

(cerebellu 

m) 


14.2 


13.4 


0.8 


6.1 


15.6 


Lung ca. 
NCI-H23 


0.0 


0.0 


0.0 • \ 


o.o 1 


0.2 


Brain 
(fetal) 


89.5 


100.0 


15.1 


100.0 


93 3 

/J.J 


Lungca. 
NCI-H460 


5.4 


3.3 


9.3 


48.3 


23.5 


Brain 
(Hippoca 
mpus) 
Pool 


35.4 


47.3 


6.6 


13.7 


31.9 


Lung ca. 
HOP-62 


7.0 


8.8 


O.O 


0.0 


8.4 


Cerebral 

Cortex 

Pool 


40.1 


53.2 


8.9 


35.1 


39.0 


Lung ca. 
NCI-H522 


3.0 


0.0 


3.0 


0.0 


0.0 


Brain 
(Substanti 
i nigra) 
Pool 


14.2 : 


33.7 < 


t.7 : 


1.2 


16.7 

Liver 


0.0 


0.0 


0.0 


0.0 


0.0 


Brain 

(Thalamus 

)Pool 


POT 
37.9 


43.2 


0.8 


f3A 

25.5 


45.1 


Fetal 
Liver 


n a 
u.u 


a a 
U.U 


0.0 


O.O 


0,2 


Brain 
(whole) 


13.9 


25.7 


2.1 


13.4 


18.6 


Liver ca. 
HepG2 


a a 


A 1 


A A 

0.0 


A A 

0.0 


0.0 


Spinal 
Cord Pool 


2.2 


2.6 


1.7 


1.4 


2.4 


Kidney 
Pool 


1 A 
l.U 


1 A 
l.U 


r\ a 

u.o 


A A 

0.4 


1.6 


Adrenal 
Gland 


0.7 


0.7 


0.8 


1.9 


0.3 


Fetal 
Kidney 


8.5 


6.9 


1.0 


6.5 


9.2 


Pituitary 
gland Pool 


18.6 


16.3 


5.0 


3.2 


36.6 


Renal ca. 
786-0 


0.0 


0.0 


0.0 


0.0 


0.0 


Salivary 
Gland 


0.1 


0.5 . 


rtn 

0.0 


■■ ■■ 

0.0 


0.1 


Renal ca. 
A498 


0.1 


0.0 


0.0 


0.0 


0.0 


Thyroid 
(female) 


11.6 


12.2 


0.2 


0.7 


9.4 


Renal ca. 
ACHN 


0.4 


0.1 


0.0 


0.0 


0.5 


Pancreatic 
ca. 

CAPAN2 


0.1 


0.0 


0.0 


0.0 


0.1 


Renal ca. 
UO-31 


0.0 


0.1 


0.0 


0.0 


0.0 


Pancreas 
Pool 


3.6 


2.3 


0.0 


1.7 j 


2.0 



Table AFJ. General_screening_panel_v1.6 

Tissue Name 
Rel. Exp.(%) Ag5243, Run 277218719 
Rel. Exp.(%) Ag5243, Run 277729929 
Rel. Exp.(%) Ag5245, Run 277219697 
Rel. Exp.(%) Ag5245, Run 277730879 
Rel. Exp.(%) Ag5247, Run 277219699 
Rel. Exp.(%) Ag5247, Run 277729933 
Rel. Exp.(%) Ag5248, Run 277219701 
Rel. Exp.(%) Ag5248, Run 277730881 


Adipose 


0.1 


0.2 


0.0 


0.0 


0.0 


0.0 


0.1 


0.1 


Melanoma* 
Hs688(A).T 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


Melanoma* 
Hs688(B).T 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.2 


0.0 


Melanoma* 
M14 


0.2 


0.0 


0.7 


0.0 


0.0 


0.0 


0.0 


0.3 


Melanoma* 
LOXIMVI 


0;2 


0.1 


0.0 


0.0 


0.0 


0.0 


0.1 


0.2 


Melanoma* 
SK-MEL-5 


2.5 


1.3 


0.0 


0.0 


0.0 


0.9 


0.1 


0.4 


Squamous ceil 

carcinoma 

SCC-4 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.1 


Testis Pool 


2.2 


3.4 


3.1 


2.3 


7.1 


3.5 


2.7 


2.8 


Prostate ca.* 
(bone met) 
PC-3 


95.3 


76.8 


11.5 


1.3 


23.7 


20.3 


76.8 


63.3 

Prostate Pool 


6.8 


7.5 


0.0 


0.0 


-6T™ 








Placenta 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.1 


0.1 


Uterus Pool 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.1 


wvtui«m td. 

OVCAK-3 


13.2 


11.7 


9.5 


4.0 


3.3 


5.2 


11.6 


14.5 


KJV dXla.it 

SK-OV-3 


0.2 


0.3 


0.0 


0.0 


0.0 


0.0 


0.1 


0.3 


W Vol IcUl Ca. 

OVCAR-4 


0.0 


0.0 


00 


0.0 


0.0 


0.0 


0.0 


0.0 


Ovarian ca. 
OVCAR-5 


6.6 


7.4 


2.3 


0.0 


4.7 


0.8 


4.7 


5.1 


Ovarian ca. 
IGROV-1 


2.0 


2.8 


0.7 


0.0 


0.0 


0.0 


1.1 


3.3 


uvanan ca. 
OVCAR-8 


14.2 


8.1 


3.6 


0.0 


7.5 . 


8.1 


8.2 


13.4 


Ovary 


a i 
U.I 


A < 

U.O 


A A 
U.U 


A A 
U.U 


U.O 


U.U 


0.7 


0.2 


Breast ca. 


7.4 


8.0 


0.0 


0.0 


3.5 


9.4 


8.0 


9.2 


Breast ca. 

MJUA-Me>-Z3 

1 


6.5 


3.0 


2.4 


2.5 


0.4 


0.7 


4.1 


6.4 


Breast ca. BT 
549 


0.0 


0.0 


0.0 


0.0 


0.0 


1.0 


0.6 


0.0 


Breast ca. 
T47D 


6.7 


3.8 


0.8 


0.0 


5.5 


1.5 


4.7 


8.0 


Breast ca. 
MDA-N 


0.0 


0.2 


0.5 


0.0 


0.0 


0.5 


0.1 


0.3 


Breast Pool 


0.2 


0.1 


0.9 


0.0 


0.0 


0.0 


0.5 


0.3 


Trachea 


18.6 


15.6 


3.9 


0.0 


14.6 


18.0 


5.5 


7.6 


Luiig 


0.2 


0.0 


0.0 


0.0 


0.0 


1.2 


0.0 


0.1 


Fetal Lung 


21.3 


21.0 


0.0 


0.7 


10.3 


5.1 


19.3 


TPi 7 


Lung ca. 
NCI-N417 


6.3 


3.2 


0.0 


0.0 


1.7 


4.2 


2.4 


2.0 


Lung ca. LX-1 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


00 

V/.V/ 


Lungca. 
NCI-H146 


23.3 


20.4 


17.0 . 


100.0 


7.1 


9.8 


16.8 


16.4 


Lung ca. 
SHP-77 


95.9 


77.9 


100.0 


35.6 


24.7 


31.9 


100.0 


76.3 


Lung ca. A549 


1.0 


0.4 


0.0 


0.0 


3.0 


3.0 


3.3 


1.1 


jun p ca. 
NCI-H526 


1.4 


1.9 


0.0 


0.O 


3.0 


3.0 


3.7 


3.5 


Lung ca. 
NC3-H23 


3.0 


3.0 


3.0 


3.0 


3.0 


3.0 


3.0 


3.1 


Lung ca. 
Na-H460 


2.8 


Z.1 


3.0 


3.0 


3.9 


3.9 


u : 


5.4 


Lung ca. 
HOP-62 


12.4 ( 


3.5 


).o 


).0 


3.6 


3.0 < 


).4 


11.6 

Lung ca. 
NCI-H522 

liver 


0.0 


0.0 

i 


0.0 


0.0 


- r>iP 

0.0 


0.0 


0.0 


It Jit "dO 1 ' T 
00 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


O.Q 


Fetal Liver 


0.0 


0.0 


0.0 


0.0 


0.0 


0.9 


0.2 


0.0 


Liver ca. 
HepG2 


0.2 


0.0 


U.U 


ft A 


ft ft 
U.U 


ft ft 

U.U 


U.U 


0.1 


Kidney Pool 


0.5 |0.9 


0.0 


0.0 


1.0 


0.0 


0.6 


1.8 


Fetal Kidney 


5.8 |6.8 


0.0 


0.0 


11.4 


6.6 


4.3 


7.9 


Renal ca. 
786-0 


0.0 


jo.o 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


Renal ca. 
A498 


0.0 


0.0 
0.2 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


Renal ca. 
ACHN 


0.0 


0.0 


0.0 


0.0 


0.0 


0.2 


0.1 


Renal ca. 
UO-31 


0.2 


0.2 


0.0 


0.0 


0.0 


0.0 


0.0 


0.1 


Renal ca. 
TK-10 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 . 


Bladder 


1.2 


1.5 


0.0 


0.0 


3.8 


1 4 




'y 


Gastric ca. 
(liver met.) 
NCI-N87 


0.0 


0.0 


0.0 


0.0 


00 


0 0 


00 




Gastric ca. 
KATOHI 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


Colon ca. 
SW-948 


4.0 


4.4 


0.7 


0.0 


2.8 


0.6 


3.6 


3.8 


Colon ca. 
SW480 


3.6 


4.0 


0.5 


0.0 


0.0 


2.3 


2.7 


4.2 


Cblonca.* 

(SW480met) 

SW620 


0.2 


0.0 


0.0 


0.0 


0.0 


0 0 




U.XJ 


Colon ca. 
HT29 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


Colon ca. 
HCT-116 


13.8 


12.7 


1.0 


0.0 


6.8 


3.1 


5.6 


14.7 


Colon ca. 
CaCo-2 


18.8 


14.9 


10.8 


4.7 


10.2 


10.1 


2.4 


11.6 


Colon cancer 
tissue 


0.0 


10 


O.O 


3.0 


O.O 


10 


0.0 


11 


Colon ca. 
SW1116 


DO 


3.0 


10 1 


10 


10 


10 


3.0 


10 


Colon ca. { 
CoIo-205 


3.0 < 


).0 


3.0 ( 


3.0 


10 ( 


10 ( 


3.0 < 


10 


Colon ca. ( 
SW-48 


10 ( 


).0 ( 


10 ( 


10 


10 ( 


10 ( 


10 ( 


).0 


Colon Pool ( 


).l ( 


).0 ( 


10 ( 


).0 ( 


19 C 


10 ( 


14 ( 


).l 


Small Intestine ( 
Pool 1 


).7 ] 


1.4 ] 


1.6 ] 


.6 ( 


17 1; 


>.0 { 


5.9 ] 


..7 

Stomach Pool 


0.6 


1.0 


0.0 


0.0 o.o a ^ 


*°. 


fip, 


1 -II "n> -3F- 

0.6 


Bone Marrow 
Pool 


U,\J 


n 1 




0.0 


0.0 






n i 


Fetal Heart 


0.0 


0.0 


0.0 


0 0 


on 


0.0 


0.0 


0.1 


Heart Pool 


0.0 


0.0 


0.0 




n n 

u.u 


0.0 


0.0 


0.0 


Lymph Node 
Pool 


fi ft 


n 7 




0.0 


0.8 


a n 
v/.u 


U.J 




Fetal Skeletal 
Muscle 


0.4 


0.1 


0.0 


0.0 


0.0 


0.0 


0.1 


0.2 


Skeletal 
Muscle Pool 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


Spleen Pool 


0.0 


0.1 


0.6 


0.0 


1.4 


0.0 


0.6 


0.5 


Thymus Pool 


2.0 


2.1 


1.0 


0.7 


1.4 


2.6 


1.9 


3.2 


CNS cancer 

(glio/astro) 

U87-MG 


1 fx 


Z.J 


U.o 


0.0 


0.7 


u.o 






CNS cancer 

(glio/astro) 

U-118-MG 


0.3 


0.1 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


(jJNi> cancer 
(neuro;met) 
SK-N-AS 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


CNS cancer 
( astro) SF-539 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.1 


0.1 


CNS cancer 

(astro) 

SNB-75 


0.2 


0.3 


0.0 


0.0 


0.0 


0.0 


0.5 


0.3 


CNS cancer 
(g!io)SNB-19 


3.1 


2.4 


0.0 


0.0 


0.0 


1.1 


1.9 


3.4 


CNS cancer 
(gHo)SF-295 


2.8 


2.2 


0.5 


0.6 


0.9 


2.6 


3.1 


2.8 


Brain 

(Amvedala) 
Pool 


23.2 


18.7 


1.0 


2.6 


7.1 


2.2 


12.2 


14.0 


Brain 

(cerebellum) 


13.8 


11.7 


3.1 


1.0 


10.2 


11.3 


13.3 


14.1 


Brain (fetal) 


100.0 


100.0 


20.6 


14.8 


100.O 


100.0 


73.2 


100.0 


Brain 

(Hippocampus 
)Pool 


51.1 


40.3 


6.9 


5.3 


25.9 


14.3 


26.8 


35.8 


^ereoraj 
Cortex Pool 


52.5 


52.5 


8.2 


0.0 


27.0 


20.9 


31.9 


31.0 


Brain 

(Substantia 
nigra) Pool 


29.5 


29.1 


1.1 


1.7 


5.5 


2.9 


9.7 


12.2 


Brain 

(Thalamus) 
Pool 


48.3 


51.1 


2.2 


2.5 


21.9 


25.2 


17.4 


31.0 


Brain (whole) 


28.7 


30.6 


6.0 


4.2 


15.2 


13.3 


9.2 


14.7 

Spinal Cord 

rOOl 


1.9 


1.3 


1.3 


0.0 


i PME 

1.0 


tWHJO 
In o 


1.0 


pL'3?' 2 

[2.2 


Adrenal Gland 


0.4 


0.8 


[1.5 


0.0 


°-° 


j0.8 


|0.3 


OA " 


Pituitary gland 
Pool 


17.9 


13.7 


2.6 


7.4 


0.0 


11.1 


13.4 


15.8 


Salivary Gland 


0.2 


0.3 


0.0 


0.0 


0.0 


0.0 


0.3 


0.6 


Thyroid 
(female) 


12.9 


10.0 


1.4 


0.0 


1.5 


0.8 


8.5 


13.9 


Pancreatic ca. 
CAPAN2 


0.0 


0.0 


0.0 


0.0 


0.0 


0.0 


0.1 


0.0 


Pancreas Pool j 


?; 6 1 


3.2 


0.0 


0.0 


0.6 


3.6 


4.5 


3.7 



Table AFK. Panel 4.1D 

Tissue Name 
Rel. Exp.(%) Ag5242, Run 229819771 
Rel. Exp.(%) Ag5245, Run 229819577 
Rel. Exp.(%) Ag5247, Run 229819792 
Rel. Exp.(%) Ag5248, Run 229819793 


Secondary Thl 
act 


jo.o 


0.0 


0.0 


0.0 


HUVECIL-lbeta 


0.2 


0.0 


0.0 


0.1 


Secondary Th2 
act 


jO.6 


4.1 


0.7 


0.5 


HUVECIFNgaimna 


0.0 


0.0 


0.0 


0.0 


Secondary Trl act 


(2.3 


1.2 


0.6 


2.3 


HUVECTNF alpha 
+ IFN gamma 


6.0 


0.0 


2.4 


7.7 


Secondary Thl 
rest 


0.0 


0.0 


0.0 


0.1 


HUVECTNF alpha 
+ IL4 


i.o 


0.0 


0.6 


4.2 


Secondary Th2 
rest 


13.7 


0.6 


5.1 


122 


fflJVEC IL-11 


9.6 


1.6 


6.4 


9.2 


Secondary Trl 
rest 


15.5 


1.9 


8.7 


14.0 


-ung Microvascular 
BC none 


3.6 


0.9 


1.0 


2.4 


Primary Thl act JlOO.0 


71.7 


100.0 


1 

85.3 I 
I 


-ung Microvascular 

3CTNFaipha-f 

L-lbeta 


0.0 


O.O 


O.O 


O.O 


Primary Th2 act J 


27.9 


12.6 : 


>0.4 


2g 2 [Microvascular 
jDermal EC none 


3.1 


).0 


).0 ( 


).3 


Primary Trl act : 


J6.6 < 




54.3 : 


jMicrosvasuIar 
*8.9 Dermal EC ( 
|TNFalpha + IL-lbeta 


).0 ( 


).0 


1.0 ( 


).0 


Primary Thl rest J] 


15.9 1 


!.9 5 


-1 1 


a g [Bronchial epithelium f 
ITNFalpha + lLlbeta 1 


u c 


1.0 C 


».0 c 


1.0 


Primary Th2 rest 2 


14.2 3 


-4 J23.3 2 


9 l jSmall airway 
[epithelium none 


.2 0 


.0 jo 


.0 o 


.2 

Primary Tr1 rest 


12.0 


5.0 


12.7 


12.9 


Small airway " ^ 
epithelium TNFalpha 
+ IL-lbeta 


3.1 


HB©- 
0.0 


:^3^ 
0.7 


v. TTtm-ff. 
3.7 


3 


CD45RACD4 
lymphocyte act 


0.6 


0.0 


0.0 


0.0 


Coronery artery 
SMC rest 


4.1 


0.0 


0.6 


3.6 


CD45RO CD4 
lymphocyte act 


0.0 


0.0 


0.0 


0.2 


Coronery artery 
SMC TNFalpha + 
IL-lbeta 


3.1 


0.0 


0.0 


2.6 




CDS lymphocyte 
act 


5.6 


2.9 


0.7 


7.3 


Astrocytes rest 


3.8 


0.9 


0.6 


4.0 




Secondary CDS 
lymphocyte rest 


0.0 


0.0 


0.0 


0.0 


Astrocytes TNFalpha 
+ IL-lbeta 


0.0 


0.0 


0.0 


0.0 




Secondary CD8 
lymphocyte act 


2.1 


0.0 


0.0 


1.9 


KU-812 (Basophil) 
rest 


0.0 


0.0 


0.0 


0.0 




CD4 Ivnrnhocvfe 
none 


8.1 


1.2 


5.8 


7.4 


KIT-812 fBasonhiH 
PMA/ionomycin 


12.6 


1.0 


4.5 


15.4 




2ry 

Thl/Th2/Trl_anti 
-CD95CH11 


0.0 


0.0 


0.0 


0.0 


CCD1106 

(Keratinocytes) none 


15J 


15.5 


4.3 


15.8 




LAK cells rest 


0.1 


0.0 


0.6 


0.1 


CCD1106 
(Keratinocytes) 
TNFalpha + IL-lbeta 


0.0 


0.0 


0.0 


0.0 


LAK cells TL-2 


0.3 


0.0 


0.0 


0.3 j 


Liver cirrhosis 


0.1 


0.0 


0.0 


O0 


LAK cells 
IL-2+IL-12 


25.2 


3.1 


4.8 


24.0 


NCI-H292none 


0.0 


0.0 


0.0 


0.0 


LAK cells 
IL-2+IFN gamma 


0.2 


0.0 


mmm .i 

0.0 


1.1 


NCI-H292 IL-4 


0.0 


0.0 


0.0 


0.0 


LAK cells EL-2+ 
IL-18 


0.5 


0.0 


0.7 


0.7 


NCI-H292 IL-9 


0.0 


0.0 


0.0 


0.0 


LAK cells 
PMA/ionomycin 


0.2 


0.0 


0.6 


0.0 


NCI-H292 EL-13 


0.2 


0.0 


0.0 


0.0 


NK Cells IL-2 
rest 


0.5 


1.9 


0.0 


0.5 


NCI-H292 JDFN 
gamma 


0.0 


0.0 


0.0 


0.0 


Two Way MLR 3 
day 


4.5 


5.1 


0.7 


2.3 


HPAEC none 


0.0 


0-0 


0.0 


0.0 


Two Way MLR 5 
day 


6.7 


14.9 


9.5 


15.0 


HPAECTNF alpha 
+ EL-1 beta 


0.1 


nn 

\Jw\J- 


0 0 

\J.\r 


0 0 


Two Way MLR 7 
day 


0.2 


0.0 

1 


0.0 


0.1 


LunP" fibrnhtast rinnft 


0.0 

a 


0.0 


DO 


00 


PBMC rest . 


8.7 


0.0 J 


2.3 


6.0 


Lung fibroblast TNF 
alpha + IL-1 beta 




7*5 7 


Tit 




PBMCPWM 


0.2 


00 1 


0.0 


0.4 


Lung fibroblast EL-4 


72.2 


100.0 


32.8 


49.7 


PBMC PHA-L 


0.2 


o.o 


0.0 


0.1 


Lung fibroblast EL-9 


1.2 


O.O 


0.4 


0.6 


Ramos (B cell) 
none 


3.6 


2.2 


1.1 


1.9 


Lung fibroblast 
DL-13 


1.8 


3.0 


1.5 


1.2 


Ramos (B cell) 
ionomycin 


1.8 


3.6 


1.5 


2.2 


Lung fibroblast IFN 
gamma 


3.0 


3.0 


3.0 1 


3.0 


B lymphocytes 
PWM 


1.3 1 


3.0 


2.0 


u 


Dermal fibroblast 
2CD1070rest 


).l ( 


3.0 < 


).0 ( 


3.0 

B lymphocytes 
CD40Land]L-4 


0.8 


0.7 


1.2 


•• 

1.5 


Dermal fibroBlist ' 
CCD1070 TNF alpha 


2.9 


0.0 


1.3 


.3 J' j 
5.3 


EOL-1 dbcAMP 


3.7 


6.7 


3.3 


2.0 


Dermal fibroblast 
CCD1070IL-1 beta 


6.3 


0.0 


1.7 


7.7 


EOL-1 dbcAMP 
PMA/ionomycin 


3.0 


0.0 


2.3 


2.0 


Dermal fibroblast 
IFN gamma 


0.0 


0.0 


0.0 


0.0 


Dendritic cells 
none 


1U./ 


1 n 

1.5* 


3.8 


13.6 


Dermal fibroblast 
IL-4 


0.0 


0.0 


0.0 


0.0 


Dendritic cells 
LPS 


4.7 


6.2 


11.7 


8.2 


Dermal Fibroblasts 
rest 


0.0 


on 


00 




Dendritic cells 
anti-CD40 


0.1 


0.0 


0.0 


0.0 


Neutrophils 


0.1 


0.0 


0.0 


0.0 


Monocytes rest 


11.6 


0.6 


2.8 


16.4 


Neutrophils rest 


87.7 


11.7 


28.3 


100.0 


Monocytes LPS 


4.6 


5.6 


1.4 


5.4 


Colon 


0.0 


0.0 


0.0 


0.0 


Macrophages rest 


0.2 


0.0 


0.0 


0.1 


Lung 


0.2 


0.0 


0.0 


0.3 


Macrophages LPS 


11.5 


0.0 


0.9 


9.2 


Thymus 


0.1 


0.0 


0.0 


0.6 


HUVECnone |0.3 


0.0 


0.0 


0.5 


Kidney 


0.1 


0.0 


1.4 


0.6 


HUVEC starved |l5.9 


8.4 


2.4 


15.5 







Table AFL. general_oncology_screening_panel_v_2.4 

Values are Rel. Exp.(%) for Ag5242 (Run 260269083), Ag5247 (Run 260269132) and Ag5248 (Run 260269133). 

Tissue Name                      Ag5242       Ag5247       Ag5248 
Colon cancer 1                   0.0          0.0          3.5 
Bladder cancer NAT 2             0.0          0.0          0.0 
Colon cancer NAT 1               7.2          0.0          11.0 
Bladder cancer NAT 3             0.0          0.0          0.0 
Colon cancer 2                   0.0          0.0          0.0 
Bladder cancer NAT 4             0.0          0.0          0.0 
Colon cancer NAT 2               17.6         16.6         15.7 
Prostate adenocarcinoma 1        2.4          20.9         5.8 
Colon cancer 3                   4.5          0.0          3.8 
Prostate adenocarcinoma 2        0.0          0.0          2.0 
Colon cancer NAT 3               37.1         0.0          27.0 
Prostate adenocarcinoma 3        71.7         55.9         54.3 
Colon malignant cancer 4         6.1          0.0          1.0 
Prostate adenocarcinoma 4        1.0          0.0          7.2 
Colon normal adjacent tissue 4   0.0          0.0          24 
Prostate cancer NAT 5            [illegible]  0.0          0.0 
Lung cancer 1                    25.0         17.9         4.2 
Prostate adenocarcinoma 6        30.6         [illegible]  [illegible] 
Lung NAT 1                       [illegible]  [illegible]  [illegible] 
Prostate adenocarcinoma 7        [illegible]  [illegible]  [illegible] 
Lung cancer 2                    40.1         100.0        100.0 
Prostate adenocarcinoma 8        9.1          5.0          6.8 
Lung NAT 2                       32.3         18.2         48.6 
Prostate adenocarcinoma 9        75.3         10.7         31.0 
Squamous cell carcinoma 3        73.2         47.0         82.4 
Prostate cancer NAT 10           0.0          0.0          7.1 
Lung NAT 3                       13.3         3.5          5.8 
Kidney cancer 1                  0.0          0.0          0.0 
metastatic melanoma 1            [illegible]  [illegible]  [illegible] 
Melanoma 2                       0.0          0.0          1.4 
Kidney cancer 2                  10.7         7.4          2.8 
Melanoma 3                       9.8          0.0          4.2 
Kidney NAT 2                     100.0        42.9         51.4 
metastatic melanoma 4            2.1          0.0          1.0 
Kidney cancer 3                  61.1         8.6          24.8 
metastatic melanoma 5            0.4          [illegible]  [illegible] 
Kidney NAT 3                     63.3         16.0         29.9 
Bladder cancer 1                 0.0          0.0          [illegible] 
Kidney cancer 4                  8.8          0.0          1.9 
Bladder cancer NAT 1             0.0          0.0          0.0 
Kidney NAT 4                     5.3          0.0          9.2 
Bladder cancer 2                 2.1          0.0          0.0 









AI_comprehensive panel_v1.0 Summary: Ag5242 Highest expression is seen in an 
osteoarthritic bone sample (CT=27.5). Prominent levels of expression are seen in a cluster 
of samples derived from RA. Thus, expression of this gene could be used to differentiate 
between these samples and other samples on this panel and as a marker of rheumatoid 
arthritis. In addition, modulation of the expression or function of this gene may be useful in 
the treatment of RA. 

CNS_neurodegeneration_v1.0 Summary: Ag5242/Ag5243/Ag5247/Ag5248 
Multiple experiments with four different probe and primer sets produce results that are in 
reasonable agreement. These panels do not show differential expression of this gene in 
Alzheimer's disease. However, these profiles confirm the expression of this gene at 
moderate levels in the brain. Please see Panel 1.5 for discussion of this gene in the central 
nervous system. 

Ag5244 Three experiments with Ag5244, which is specific for CG150799-03, 
detect expression of this gene at low but significant levels in the hippocampus and temporal 
cortex of Alzheimer's patients. This expression may suggest an involvement of this gene 
product in the etiology of this disease. 

One experiment with Ag5244 (Run 276863567) and two experiments with Ag5245 
(Run 276863569 and Run 277731463), also specific for CG150799-03, show 
low/undetectable levels of expression (CTs>35). (Data not shown). Two additional 
experiments with Ag5245 show low expression in samples from the parietal cortex of a 
normal patient and the inferior temporal cortex of an Alzheimer's patient. 

General_screening_panel_v1.5 
Summary: Ag5242/Ag5243/Ag5245/Ag5247/Ag5248 Multiple experiments with five 
different probe and primer sets produce results that are in reasonable agreement. Highest 
expression is seen in cell lines from lung and prostate cancers and the fetal brain 
(CTs=28-30). This gene, which encodes a MASS1 homolog, appears to be preferentially 
expressed in the brain, with prominent levels of expression in all regions of the CNS 
examined. MASS1 is a large, calcium-binding GPCR expressed in the central nervous 
system that may play a fundamental role in its development (MacMillan, J Biol Chem 2002 
Jan 4;277(1):785-92). In addition, this gene has been associated with some 
nonsymptomatic epilepsies (Skardski, Neuron, Vol 31, 537-544, August 2001). Thus, based 
on the homology of this protein to MASS1 and the preferential expression in the brain, 
expression of this gene could be used to differentiate between brain and non-neural tissue. 
In addition, therapeutic modulation of the expression or function of this gene may be useful 
in the treatment of neurological disorders, such as Alzheimer's disease, Parkinson's disease, 
schizophrenia, multiple sclerosis, stroke and epilepsy. 

Moderate levels of expression are also seen in samples from lung, colon, ovarian 
and prostate cancer cell lines. This suggests that expression of this gene could be used as a 
marker of these cancers. Furthermore, therapeutic modulation of the expression or function 
of this gene may be useful in the treatment of these cancers. 

Ag5244 Expression of this gene is low/undetectable (CTs > 35) across all of the 
samples on this panel. 

General_screening_panel_v1.6 Summary: [illegible] 
Multiple experiments with three different probe and primer sets produce results that are in 
very good agreement. Highest expression is seen in a lung cancer cell line and the fetal 
brain (CTs=27-32). Overall, expression is in excellent agreement with Panel 1.5, with 
prominent expression seen in all regions of the CNS, and lung and prostate cancer cell 
lines. Please see Panel 1.5 for further discussion of this gene. 

Ag5244 Expression of this gene is low/undetectable (CTs > 35) across all of the 
samples on this panel. 

Panel 4.1D Summary: Ag5242/Ag5243/Ag5247/Ag5248 Multiple experiments 
with four different probe and primer sets show highest expression of this gene in primary 
activated Th1 cells and resting neutrophils (CTs=27-31). Since this gene is expressed 
predominantly in activated Th1 vs Th2 cells, regulation of the expression of this gene 
might also be important for autoimmune diseases such as rheumatoid arthritis (please see 
also AI panel). Moderate levels of expression are also seen in IL-4 treated lung fibroblasts 
and resting neutrophils. Thus, therapeutic regulation of the transcript or the protein encoded 
by the transcript could be important in immune modulation and in the treatment of T 
cell-mediated diseases such as asthma, arthritis, psoriasis, IBD, and lupus. 

Ag5245 Highest expression of this gene is seen in IL-4 treated lung fibroblasts 
(CT=32). Low but significant expression is also seen in TNF-alpha/IL-1 beta treated lung 
fibroblasts and primary activated Th1 cells. Three experiments with the probe and primer 
set Ag5244 show low/undetectable levels of expression (CTs>35). 

general oncology screening panel_v_2.4 Summary: Ag5242/Ag5243/Ag5247/Ag5248 
Four experiments with the different probe and primer sets show highest expression in a 
lung cancer and normal kidney tissue adjacent to a tumor (CTs=31-34). Overall, this gene 
is expressed at low but significant levels in prostate cancer, normal kidney and kidney 
cancer, squamous cell carcinoma and normal colon. Therefore, therapeutic modulation of 
this gene or its protein product may be useful in the treatment of lung, prostate and 
kidney cancers. 

Ag5244/Ag5245 Expression of this gene is low/undetectable in all samples on this 
panel (CTs>35). 

AG. CG151014-01: Metabotropic glutamate receptor 3-variant 

Expression of gene CG151014-01 was assessed using the primer-probe set Ag5219, 
described in Table AGA. Results of the RTQ-PCR runs are shown in Tables AGB, AGC 
and AGD. 

Table AGA. Probe Name Ag5219 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-tgattgtgaattgcagttcagt-3' | 22 | 2550 | 381 
Probe | TET-5'-aagtgctcacgtgcagctccagaata-3'-TAMRA | 26 | 2598 | 382 
Reverse | 5'-gtactagggttgttcttttgctct-3' | 24 | 2631 | 383 



Table AGB. CNS_neurodegeneration_v1.0 

Tissue Name | Rel. Exp.(%) Ag5219, Run 228020421 | Tissue Name | Rel. Exp.(%) Ag5219, Run 228020421 
AD 1 Hippo | 9.4 | Control (Path) 3 Temporal Ctx | 6.5 
AD 2 Hippo | 24.8 | Control (Path) 4 Temporal Ctx | 25.0 
AD 3 Hippo | 6.3 | AD 1 Occipital Ctx | 15.7 
AD 4 Hippo | 7.6 | AD 2 Occipital Ctx (Missing) | 0.0 
AD 5 Hippo | 53.2 | AD 3 Occipital Ctx | 6.8 
AD 6 Hippo | 24.1 | AD 4 Occipital Ctx | 33.2 
Control 2 Hippo | 40.9 | AD 5 Occipital Ctx | 51.8 
Control 4 Hippo | 6.7 | AD 6 Occipital Ctx | 15.3 
Control (Path) 3 Hippo | 5.6 | Control 1 Occipital Ctx | 7.6 
AD 1 Temporal Ctx | 19.1 | Control 2 Occipital Ctx | 46.0 
AD 2 Temporal Ctx | 34.9 | Control 3 Occipital Ctx | 16.6 
AD 3 Temporal Ctx | 5.6 | Control 4 Occipital Ctx | 8.5 
AD 4 Temporal Ctx | 25.3 | Control (Path) 1 Occipital Ctx | 90.1 
AD 5 Inf Temporal Ctx | 100.0 | Control (Path) 2 Occipital Ctx | 11.5 
AD 5 Sup Temporal Ctx | 32.5 | Control (Path) 3 Occipital Ctx | 3.8 
AD 6 Inf Temporal Ctx | 44.1 | Control (Path) 4 Occipital Ctx | 11.9 
AD 6 Sup Temporal Ctx | 32.5 | Control 1 Parietal Ctx | 9.5 
Control 1 Temporal Ctx | 10.5 | Control 2 Parietal Ctx | 40.6 
Control 2 Temporal Ctx | 45.4 | Control 3 Parietal Ctx | 18.3 
Control 3 Temporal Ctx | 28.9 | Control (Path) 1 Parietal Ctx | 74.2 
Control 3 Temporal Ctx | 10.1 | Control (Path) 2 Parietal Ctx | 27.5 
Control (Path) 1 Temporal Ctx | 55.1 | Control (Path) 3 Parietal Ctx | 5.0 
Control (Path) 2 Temporal Ctx | 36.1 | Control (Path) 4 Parietal Ctx | 56.3 

Table AGC. General_screening_panel_v1.5 



1 issue Name 


ReL 

Exp.(%) 
Ag5219, 
Run 


issue Name 


Rel. 

Exp.(%) 
Ag5219, 
Run 

228758224 


Adipose 


0.3 


Renal ca.TK-10 


0.4 


Melanoma* Hs688(A).T 


0.0 


Bladder 10.2 


Melanoma* Hs688(B).T 


00 


Gastric ca. (liver met.) NCI-N87 


16.6 


Melanoma* M14 


00 

Um\J 


Gastric ca. KATO D3 


0.0 


Melanoma* LOXIMVI 




Colon ca. SW-948 


0.1 


Melanoma* SK-MEL-5 


O R 
u.o 


Colon ca. SW480 


0.6 


Squamous cell carcinoma SCC-4 


08 


Colon ca.* (SW480 met) SW620 


1.1 


Testis Pool 


04 


Colon ca. HT29 


0.0 


Prostate ca * fhnnp mprt pr^ 


Z.I 


Colon ca.HCT-1 16 


1.7 




0 ^ 


Colon ca. CaCo-2 


0.7 


Placenta 


0.0 


Colon cancer tissue 


0.0 


Uterus Pool 


O 9 


Colon ca. SW1116 


0.0 


Ovarian ca. OVCAR-3 


1 O 


Colon ca. Colo-205 


0.0 


Ovarian ca. SK-OV-3 


0 0 


Colon ca. SW-48 


0.0 


Ovarian ca. OVCAR-4 


00 


Colon Pool 


0.7 


Ovarian ca. OVCAR-5 


0 7 


Small Intestine Pool 


0.7 1 


Ovarian ca. IGROV-1 


0 O 


Stomach Pool 


1.4 


Ovarian ca. OVCAR-8 


0.1 


Bone Marrow Pool 


6.1 


Ovary 


0 1 


Fetal Heart 


0.6 


Breast ca. MCF-7 


00 


Heart Pool 


0.3 


Breast ca. MDA-MB-231 




Lymph Node Pool 


1.1 


Breast ca. BT 549 


DO 


Fetal Skeletal Muscle 


0.1 


Breast ca.T47D 


10 


Skeletal Muscle Pool 


5.7 


Breast ca. MDA-N 


1 0 


Spleen Pool 


1.4 


Breast Pool 


i.6 


rhymusPool 


).4 


Trachea ( 


).4 < 


3SfS cancer (glio/astro) U87-MG ] 


1.0 


Lung J 


).2 ( 


3NS cancer (glio/astro) U-l 18-MG ( 


).i 


Fetal Lung \ 


1.8 ( 


cancer (neuro;met) SK-N-AS i 


.4 


Lungca.NCL-N417 


).l ( 


?NS cancer (astro) SF-539 C 


>.0 


Lungca. LX-1 4 


1.5 C 


?NS cancer (astro) SNB-75 0 


.0 


Lung ca. NCI-H146 " 1 


.1 C 


NS cancer (glio) SNB-19 0 


.0 


Lungca.SHP-77 3 


.3 C 


^NS cancer (glio) SF-295 0 


.0 


Lung ca. A549 0 


.0 B 


rain (Amygdala) Pool 6 


0.3 


Lungca.NCI-H526 0 


.3 B 


rain (cerebellum) 1 


00.O 


Lungca.NCI-H23 0 


.4 B 


rain (fetal) 6 


6.4 
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Lung ca. NCI-H460 


{0.9 


Brain (Hippocampus)- Pool 




Lung ca. HOP-62 


fo.o 


Cerebral Cortex Pool 


80.1 


Lung ca. NCI-H522 


|0.7 


Brain (Substantia nigra) Pool 


54.0 


Liver 


jo.o 


Brain (Thalamus) Pool 


94.6 


Fetal Liver 


|0.4 


Brain (whole) 


65.1 


Liver ca. HepG2 


\r\ o 
yj.y 


opmai Lord rooi 


io.o 


Kidney Pool 


1.5 


Adrenal Gland 


0.6 


Fetal Kidney 


0.7 | 


Pituitary gland Pool 


0.9 


Renal ca. 786-0 


jo.o 


Salivary Gland 


0.2 


Renal ca. A498 


fo.o 


Thyroid (female) 


0.0 


Renal ca. ACHN 


■1,0 


Pancreatic ca. CAPAN2 


0.1 


Renal ca. UO-31 


|0.5 


Pancreas Pool 


0.9 



Table AGD. Panel 4.1D 



Tissue Name 


Rel. 

Exp (%) 
Ag5219, 
Run 

229739298 


Tissue Name 


ReL 

Exp.(%) 
Ag5219, 
Run 

229739298 


Secondary Thl act 


0.0 


HUVEC IL-lbeta 


3.3 


Secondary Th2 act 


3.2 


HUVEC IFN gamma 


14.0 


Secondary Trl act 


0.0 


HUVEC TNF alpha + EFN gamma 


2.9 


Secondary Thl rest 


0.0 


HUVEC TNF alpha + IL4 


1.8 


Secondary Th2 rest 


0.0 


HUVEC DL-11 


21.8 


Secondary Trl rest 


2.9 


Lung Microvascular EC none 


100.0 


Primary Thl act 


0.0 


Lung Microvascular EC TNFalpha 
+ IL-lbeta 


31.9 


Primary Th2 act 


5.8 


Microvascular Dermal EC none 


0.0 


Primary Trl act 


0.0 


Microsvasular Dermal EC 
TNFalpha + IL-lbeta 


15.5 


Primary Thl rest 


0.0 


Bronchial epithelium TNFalpha +• 
ILlbeta 


0.0 


Primary Th2 rest 


1.8 


Small airway epithelium none 


0.0 


Primary Trl rest 


4.7 


Small airway epithelium TNFalpha 
+ IL-lbeta 


3.4 


CD45RA CD4 lymphocyte act 


0.0 


Coronery artery SMC rest 


2.3 


CD45RO CD4 lymphocyte act 


11.1 


Coronery arteiy SMC TNFalpha + 
IL-lbeta 


0.0 


CD8 lymphocyte act 


6.7 


Astrocytes rest 


0.0 


Secondary CD8 lymphocyte rest 


5.9 


Astrocytes TNFalpha + IL-lbeta 


3.4 


Secondary CD8 lymphocyte act 


0.0 


KU-812 (Basophil) rest 


4.1 


CD4 lymphocyte none 


3.3 


KU-812 (Basophil) 
PMA/ionomycin 


26.1 
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2ry Thl/Th2/Trl_anti-CD95 


5.9 


il- 1U ^ «tJf fas IJ jrf 

CCD1106 (Keratinocytes) none 




LAK cells rest 


3.0 


CCD1106 (Keratinocytes) 

TMCoTnlio J- TT 1 "U/af o 

i iNjrajpna + ul-i Deta 


0.0 


LAK cells EL-2 


2 0 


Liver cirrhosis 


0.0 


LAK cells IL-2+IL-12 


KJAJ 


iMPT T-TOOO nATio 

|iNUi-.nzyz none 


18.2 


LAK cells IL-2-fIFN gamma 


^ 0 


IMPT TJOQO TT /f 


16.7 


LAK cells DL-2+IL-18 


2.7 


JNCI-H292 IL-9 


25.0 


L-A&. cells PMA/ionomycm 


0.0 


jNa-H2921L-]3 


48.3 


NK Cells EL-2 rest 


24.1 


|NCI-H292 IFN gamma 


19.9 


1 wo Way MLR 3 day 


3.5 


jHPAEC none 


8.1 


i wo Way MLR 5 day 


1.5 


jHPAEC TNF alpha + 1L-1 beta 


7.8 


lwo Way MLR 7 day 


0.0 


jLung fibroblast none 


0.0 


PBMCrest 


0.0 


iLung fibroblast TNF alpha + IL-1 
jbeta 


2.0 


PBMC P WM 


li.b 


[Lung fibroblast IL-4 


7.9 


PBMC PHA-L 


0.0 


[Lung fibroblast IL-9 


0.0 


Ramos (B cell) none 


18.2 


|Lung fibroblast IL-13 


0.0 


Ramos (B cell) ionomycin 


59.9 


Lung fibroblast IFN gamma 


2.8 


B lymphocytes PWM 


4.2 


Dermal fibroblast CCD1 070 rest 


0.6 


B IvmDhocvtes CD40T anil TT -4 




Dermal fibroblast CCD1070 TNF 
alpha 


0.0 


EOL-1 dbcAMP 


0.0 


Dermal fibroblast CCD1070 IL-1 
beta 


0. 1 


T?/"YT 1 J"l_ a H jm 

EOL-1 dbcAMP 
PMA/ionomvcin 


4.8 


Dermal fibroblast IFN gamma 


40.6 


Dftndritir ppIIc nnnf 

AS^ilUl iLIV tcio nunc 




Dermal fibroblast IL-4 


25.0 


Dendritic cell* T P*? 


J.U 


Dermal Fibroblasts rest 


2.1 


Dendritic cells anti-CD40 < 


).0 


Sfeutrophils TNFa+LPS 


*>0 

JAM 


Monocytes rest ( 


).0 


Neutrophils rest 


).0 


Monocytes LPS ( 


).0 


Colon ( 


).0 


Macrophages rest ( 


).0 ] 


Lung ( 


j.0 


Macrophages LPS ( 


).0 


rhymus ( 


).0 


HUVECnone ] 


.7 ] 


<idney j 


1.3 


HUVEC starved pKi 







CNS_neurodegeneration_v1.0 Summary: Ag5219 This panel confirms the 
expression of this gene at low levels in the brain in an independent group of individuals. 
This gene is found to be slightly down-regulated in the temporal cortex of Alzheimer's 
disease patients. Therefore, up-regulation of this gene or its protein product, or treatment 
with specific agonists for this receptor may be of use in reversing the dementia, memory 
loss, and neuronal death associated with this disease. 

General_screening_panel_v1.5 Summary: Ag5219 Highest expression of this 
gene is detected in cerebellum (CT=27). High expression of this gene is mainly seen in all the 
regions of the central nervous system examined, including amygdala, hippocampus, substantia 
nigra, thalamus, cerebellum, cerebral cortex, and spinal cord. Therefore, therapeutic 
modulation of this gene product may be useful in the treatment of central nervous system 
disorders such as Alzheimer's disease, Parkinson's disease, epilepsy, multiple sclerosis, 
schizophrenia and depression. 

In addition, moderate to low levels of expression of this gene are also seen in a 
number of cancer cell lines derived from brain, colon, gastric, lung, ovarian, and prostate 
cancers, squamous cell carcinoma and melanoma. Therefore, therapeutic modulation of this 
gene may be useful in the treatment of these cancers. 

Low levels of expression of this gene are also seen in tissues with 
metabolic/endocrine functions including pancreas, adrenal and pituitary glands, fetal heart, 
skeletal muscle and gastrointestinal tract. Therefore, therapeutic modulation of the activity 
of this gene may prove useful in the treatment of endocrine/metabolically related diseases, 
such as obesity and diabetes. 

Panel 4.1D Summary: Ag5219 Highest expression of this gene is detected in lung 
microvascular endothelial cells (CT=32.4). This gene is expressed at lower levels in 
cytokine activated lung microvascular cells, activated dermal fibroblasts, resting and 
activated mucoepidermoid NCI-H292, activated basophils, starved and IL-11 stimulated 
HUVEC cells, Ramos B cells, and resting IL-2 treated NK cells. Therefore, therapeutic 
modulation of this gene may be useful in the treatment of autoimmune and inflammatory 
diseases such as asthma, allergies, inflammatory bowel disease, lupus erythematosus, 
psoriasis, rheumatoid arthritis, and osteoarthritis. 

AH. CG151014-02 and CG151014-03: Metabotropic glutamate 
receptor 3. 

Expression of gene CG151014-02 and CG151014-03 was assessed using the 
primer-probe set Ag5220, described in Table AHA. Results of the RTQ-PCR runs are 
shown in Tables AHB and AHC. Please note that CG151014-03 represents a full-length 
physical clone. 

Table AHA. Probe Name Ag5220 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-atcaacttcacgggtgcag-3' | 19 | 1399 | 384 
Probe | TET-5'-ctttgtggtcttgggctgtttgtttg-3'-TAMRA | 26 | 1453 | 385 
Reverse | 5'-caggatgatgtgaaccttgg-3' | 20 | 1482 | 386 
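
The entries of Table AHA can be cross-checked with a short sketch that compares each stated length with the length of the listed sequence and estimates the span covered by the primer-probe set. The sketch assumes that all start positions refer to the same coordinate system on the target sequence and that each oligonucleotide occupies positions start through start + length - 1; neither assumption is stated in the table, and the names Oligo, length_mismatches and covered_span are hypothetical.

```python
from typing import NamedTuple


class Oligo(NamedTuple):
    name: str
    sequence: str       # 5'->3' bases as listed, TET/TAMRA labels removed
    stated_length: int  # "Length" column
    start: int          # "Start Position" column


# Values transcribed from Table AHA (probe set Ag5220).
AG5220 = [
    Oligo("Forward", "atcaacttcacgggtgcag", 19, 1399),
    Oligo("Probe", "ctttgtggtcttgggctgtttgtttg", 26, 1453),
    Oligo("Reverse", "caggatgatgtgaaccttgg", 20, 1482),
]


def length_mismatches(oligos):
    """Names of oligos whose sequence length differs from the stated length."""
    return [o.name for o in oligos if len(o.sequence) != o.stated_length]


def covered_span(oligos):
    """First position, last position and size of the region the set covers.

    Assumption: each oligo covers start .. start + stated_length - 1.
    """
    first = min(o.start for o in oligos)
    last = max(o.start + o.stated_length - 1 for o in oligos)
    return first, last, last - first + 1


if __name__ == "__main__":
    print(length_mismatches(AG5220))
    print(covered_span(AG5220))
```

The same check can be applied to the other primer-probe tables by substituting their sequences, lengths and start positions.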



Table AHB. CNS_neurodegeneration_v1.0 



Tissue Name 


ReL 

Exp.(%) 
Ag5220, 
Run 

228020422 


issue Name 


Rel. 

Exp.(%) 
Ag5220, 
Run 

228020422 


tMJ I illppO 


2.0 


Control (Path) 3 Temporal Ctx , 


5.8 


AD 7 Hinnn 


AG ft 


Control (Path) 4 Temporal Ctx 


25.2 


AD ^ u; nnn 
r\xJ _> ruppu 


1 A 

1.0 


AD 1 Occipital Ctx 


5.6 


AD 4 Hippo 


13.5 


AD 2 Occipital Ctx (Missing) 


0.0 


AD 5 hippo 


35.4 


AD 3 Occipital Ctx 


3-1 


AD 6 Hippo 


59.9 


AD 4 Occipital Ctx 


24.7 


Control 2 Hippo 


34.2 


AD 5 Occipital Ctx 


17.2 


Control 4 Hippo 


7.0 


AD 6 Occipital Ctx 


61.6 


Control (Path) 3 Hippo 


4.4 


Control 1 Occipital Ctx 


2.6 


AD 1 Temporal Ctx 


6.0 


Control 2 Occipital Ctx 


43.2 


AD 2 Temporal Ctx 


39.2 


Control 3 Occipital Ctx 


10.2 


AD 3 Temporal Ctx 


24 


Control 4 Occipital Ctx 


9.0 


AD 4 Temporal Ctx 


29.9 


Control (Path) 1 Occipital Ctx llOO.O 


AD 5 M Temporal Ctx 


76.3 


Control (Path) 2 Occipital Ctx 


7.7 


AD5SupTemporalCtx 


29.9 


Control (Path) 3 Occipital Ctx 


2.1 


AD 6 Inf Temporal Ctx 


60.3 


Control (Path) 4 Occipital Ctx 


14.2 


AD 6 Sup Temporal Ctx 


69.3 


Control 1 Parietal Ctx 


7.0 


Control 1 Temporal Ctx 


13.2 


Control 2 Parietal Ctx " r 


24.3 


Control 2 Temporal Ctx 


52.9 


Control 3 Parietal Ctx 


15.4 


Control 3 Temporal Ctx 


23.3 


Control (Path) 1 Parietal Ctx 


39.5 


Control 4 Temporal Ctx 


11.7 < 


Control (Path) 2 Parietal Ctx 


15.2 


Control (Path) 1 Temporal Ctx X 


57.1 < 


Control (Path) 3 Parietal Ctx ( 


U 


Control (Path) 2 Temporal Ctx i 


>9.0 ( 


Control (Path) 4 Parietal Ctx 2 


53.0 

Table AHC. General_screening_panel_v1.5 

Tissue Name | Rel. Exp.(%) Ag5220, Run 228758228 | Tissue Name | Rel. Exp.(%) Ag5220, Run 228758228 
Adipose | 0.0 | Renal ca. TK-10 | 0.0 
Melanoma* Hs688(A).T | 0.0 | Bladder | 0.0 
Melanoma* Hs688(B).T | 0.0 | Gastric ca. (liver met.) NCI-N87 | 0.0 
Melanoma* M14 | 0.0 | Gastric ca. KATO III | 0.0 
Melanoma* LOXIMVI | 0.0 | Colon ca. SW-948 | 0.0 
Melanoma* SK-MEL-5 | 0.0 | Colon ca. SW480 | 0.0 
Squamous cell carcinoma SCC-4 | 0.0 | Colon ca.* (SW480 met) SW620 | 0.0 
Testis Pool | 0.0 | Colon ca. HT29 | 0.0 
Prostate ca.* (bone met) PC-3 | 0.0 | Colon ca. HCT-116 | 0.0 
Prostate Pool | 0.0 | Colon ca. CaCo-2 | 0.0 
Placenta | 0.0 | Colon cancer tissue | 0.0 
Uterus Pool | 0.0 | Colon ca. SW1116 | 0.0 
Ovarian ca. OVCAR-3 | 0.0 | Colon ca. Colo-205 | 0.0 
Ovarian ca. SK-OV-3 | 0.0 | Colon ca. SW-48 | 0.0 
Ovarian ca. OVCAR-4 | 0.0 | Colon Pool | 0.0 
Ovarian ca. OVCAR-5 | 0.0 | Small Intestine Pool | 0.0 
Ovarian ca. IGROV-1 | 0.0 | Stomach Pool | 1.6 
Ovarian ca. OVCAR-8 | 0.0 | Bone Marrow Pool | 0.0 
Ovary | 0.0 | Fetal Heart | 0.0 
Breast ca. MCF-7 | 0.0 | Heart Pool | 0.0 
Breast ca. MDA-MB-231 | 0.0 | Lymph Node Pool | 0.7 
Breast ca. BT 549 | 0.0 | Fetal Skeletal Muscle | 0.0 
Breast ca. T47D | 0.0 | Skeletal Muscle Pool | 0.0 
Breast ca. MDA-N | 0.0 | Spleen Pool | 0.0 
Breast Pool | 2.3 | Thymus Pool | 0.0 
Trachea | 0.0 | CNS cancer (glio/astro) U87-MG | 0.0 
Lung | 0.0 | CNS cancer (glio/astro) U-118-MG | 0.0 
Fetal Lung | 0.0 | CNS cancer (neuro;met) SK-N-AS | 0.0 
Lung ca. NCI-N417 | 0.0 | CNS cancer (astro) SF-539 | 0.0 
Lung ca. LX-1 | 0.0 | CNS cancer (astro) SNB-75 | 0.0 
Lung ca. NCI-H146 | 0.0 | CNS cancer (glio) SNB-19 | 0.0 
Lung ca. SHP-77 | 0.0 | CNS cancer (glio) SF-295 | 0.0 
Lung ca. A549 | 0.0 | Brain (Amygdala) Pool | 75.8 
Lung ca. NCI-H526 | 0.0 | Brain (cerebellum) | 100.0 
Lung ca. NCI-H23 | 0.0 | Brain (fetal) | 69.3 
Lung ca. NCI-H460 | 0.2 | Brain (Hippocampus) Pool | 53.2 
Lung ca. HOP-62 | 0.0 | Cerebral Cortex Pool | 72.2 
Lung ca. NCI-H522 | 0.0 | Brain (Substantia nigra) Pool | 80.7 
Liver | 0.0 | Brain (Thalamus) Pool | 96.6 
Fetal Liver | 0.0 | Brain (whole) | 78.5 
Liver ca. HepG2 | 0.0 | Spinal Cord Pool | 25.0 
Kidney Pool | 0.0 | Adrenal Gland | 4.3 
Fetal Kidney | 0.5 | Pituitary gland Pool | 0.0 
Renal ca. 786-0 | 0.0 | Salivary Gland | 0.0 
Renal ca. A498 | 0.0 | Thyroid (female) | 0.0 
Renal ca. ACHN | 0.0 | Pancreatic ca. CAPAN2 | 0.0 
Renal ca. UO-31 | 0.0 | Pancreas Pool | 0.0 



CNS_neurodegeneration_v1.0 Summary: Ag5220 This panel confirms the 
expression of this gene at low levels in the brains of an independent group of individuals. 
However, no differential expression of this gene was detected between Alzheimer's 
diseased postmortem brains and those of non-demented controls in this experiment. Please 
see Panel 1.5 for a discussion of this gene in treatment of central nervous system disorders. 

General_screening_panel_v1.5 Summary: Ag5220 Highest expression of this 
gene is detected in cerebellum (CT=27). High expression of this gene is mainly seen in all the 
regions of the central nervous system examined, including amygdala, hippocampus, substantia 
nigra, thalamus, cerebellum, cerebral cortex, and spinal cord. Therefore, therapeutic 
modulation of this gene product may be useful in the treatment of central nervous system 
disorders such as Alzheimer's disease, Parkinson's disease, epilepsy, multiple sclerosis, 
schizophrenia and depression. 

Panel 4.1D Summary: Ag5220 Expression of this gene is low/undetectable (CTs 
> 35) across all of the samples on this panel. 

AI. CG151297-01: CALMODULIN-DEPENDENT 
PHOSPHODIESTERASE. 

Expression of gene CG151297-01 was assessed using the primer-probe set Ag7165, 
described in Table AIA. Results of the RTQ-PCR runs are shown in Table AIB. Please note 
that CG151297-01 represents a full-length physical clone. 

Table AIA. Probe Name Ag7165 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-agaatgtaccgaaaaacattttctct-3' | [illegible] | [illegible] | [illegible] 
Probe | TET-5'-ttcctcttatagaggaagcctcaaaagccg-3'-TAMRA | 30 | 536 | 388 
Reverse | 5'-tgcttgccacataggaagaa-3' | 20 | 570 | 389 



Table AIB. Panel 4.1D 



Tissue Name 


ReL 
Ex.(%) 
Ag7165, 
Run 

307719896 


Tissue Name 


ReL 

KSxjp.y /o) 
Ag7165, 
Run 

307719896 


Secondary Thl act 


0.0 


HUVEC IL-lbeta 


0.0 


Secondary Th2 act 


0.0 


HUVEC IFN gamma 


0.0 


Secondary Trl act 


0.0 


HUVEC TNF alpha + IFN gamma 


0.0 


Secondary Thl rest 


p.o 


HUVEC TNF alpha + TLA 


0.0 


Secondary Th2 rest 


0.0 


HUVEC IL-11 


0.0 


Secondary Trl rest 


0.0 


Lung Microvascular EC none 


0.0 


Primary Thl act 


0.0 


Lung Microvascular EC TNFalpha 
+ IL-lbeta 


0.0 


Primary Th2 act 


0.0 


(Microvascular Dermal EC none 


0.0 


Primary Trl act 


0.0 


\^icro^va<inlfli* T")prmal Pf* 1 

jLTIXVI V/O T U» U1(U ,L/(/l llxCLX JLj v> 

TNFalpha + IL-lbeta 


0.0 


Primary Thl rest 


0.0 


Bronchial ^nithplnim TNPalnha 4- 
ILlbeta 


0.0 


Primary Th2 rest 


0.0 


Small airway epithelium none 


0.0 


ji i iiiiaiy lit rcoi 


u.u 


Small airway epithelium TNFalpha 
+ IL-lbeta 


0.0 


CD45RA CD4 lymphocyte act 


0.0 


Coronery artery SMC rest 


0.0 


CD45RO CD4 lymphocyte act 


o.o ; 


Coronery artery SMC TNFalpha + 
IL-lbeta 


0.0 


GD8 lymphocyte act 


0.0 


Astrocytes rest 


0.0 


Secondary CD8 lymphocyte rest 


0.0 


Astrocytes TNFalpha + IL-lbeta 


0.0 


Secondary CD8 lymphocyte act 


0.0 


KU-812 (Basophil) rest 


0.0 


CD4 lymphocyte none 


0.0 


KU-8 12 (Basophil) 
PMA/ionomycin 


o.o 


2ry Thl/Th2/Trl_anti-CD95 
CH11 


0.0 


CCD1106 (Keratinocytes) none 


O.O 


LAK cells rest 


3.0 


CCD1106 (Keratinocytes) 
rNFalpha + IL-lbeta 


3.0 


LAK cells DL-2 


3.0 


Liver cirrhosis 


100.0 


LAK cells 1L-2+EL-12 


).0 


rci-H292 none 


3.0 


LAK cells IL-2+IFN gamma 


3.0 


^CI-H292IL-4 ( 


).0 


LAK cells IL-2+IL-18 ( 


).0 1 


*CI-H292 JL-9 C 


).0 


LAK cells PMA/ionomycin ( 


).0 I 


SO-H292IL-13 ( 


).0 
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NK Cells IL-2 rest 


-j . . frr ir ir r n i n *3 _ 

0.0 [NCI-H292 fr#gaMia lLl U U *~ ' 


' Q"^ t "" JJm «^.4t .t»* .... 


Two Way MLR 3 day 


0.0 |HPAECnone 


0.0 


Two Way MLR 5 day 


0.0 IHPAEC TNF alpha + IL-1 beta 


0.0 


Two Way MLR 7 day 


0.0 |Lung fibroblast none 


0.0 


PBMCrest 


Q Q jLung fibroblast TNF alpha + EL-1 
[beta 


0.0 


PBMCPWM 


0.0 (Lung fibroblast IL-4 


0.0 


PBMC PHA-L 


0.0 jLung fibroblast 1L-9 


0.0 


Ramos (B cell) none 


0.0 jLung fibroblast 1L-13 


0.0 


Ramos (B cell) ionomycin 


0.0 


Lung fibroblast IFN gamma 


0.0 


B lymphocytes PWM 


0.0 


Dermal fibroblast CCD1070 rest 


00 


B lymphocytes CD40L and IL-4 


0.0 


Dermal fibroblast CCD1070 TNF 
alpha 


0.0 


FOT -1 rihrAMP 


0.0 


Dermal fibroblast CCD1070 IL-1 
beta 


0.0 


EOL-1 dbcAMP 
PMA/ionomycin 


0.0 


Dermal fibroblast IFN gamma 


no 


Dendritic cells none 


6.0 


Dermal fibroblast IL-4 


0.0 j 


Dendritic cells LPS 


0.0 Dermal Fibroblasts rest 


0.0 




0.0 


Neutrophils TNFa+LPS 


0.0 


Monocytes rest 


0.0 


Neutrophils rest 


0.0 


Monocytes LPS 


0.0 


Colon 


0.0 


Macrophages rest 


0.0 


Lung 


0.0 


Macrophages LPS 


O.O 


Thymus 


o.o 


HUVECnone 


o.o 


Kidney 


3.0 


HUVEC starved 


10 







CNS_neurodegeneration_v1.0 Summary: Ag7165 Expression of this gene is 
low/undetectable (CTs > 35) across all of the samples on this panel. 

Panel 4.1D Summary: Ag7165 A moderate level of expression of this gene is 
detected mainly in the liver cirrhosis sample (CT=31.5). The presence of this gene in liver 
cirrhosis (a component of which involves liver inflammation and fibrosis) suggests that 
antibodies to the protein encoded by this gene could also be used for the diagnosis of liver 
cirrhosis. Furthermore, therapeutic agents involving this gene may be useful in reducing or 
inhibiting the inflammation associated with fibrotic and inflammatory diseases. 

AJ. CG152256-01: Phosphatidylserine synthase. 

Expression of gene CG152256-01 was assessed using the primer-probe set Ag6718, 
described in Table AJA. Results of the RTQ-PCR runs are shown in Tables AJB, AJC and 
AJD. 

Table AJA. Probe Name Ag6718 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-gagcctcgcttccgattat-3' | 19 | 2012 | 390 
Probe | TET-5'-tcccttcccaatattattcatccaga-3'-TAMRA | 26 | 2031 | 391 
Reverse | 5'-ctctagcaggtttgcttttgtg-3' | 22 | 2070 | 392 



Table AJB. CNS_neurodegeneration_v1.0 

Tissue Name | Rel. Exp.(%) Ag6718, Run 276596848 | Tissue Name | Rel. Exp.(%) Ag6718, Run 276596848 
AD 1 Hippo | 19.8 | Control (Path) 3 Temporal Ctx | 2.6 
AD 2 Hippo | 26.6 | Control (Path) 4 Temporal Ctx | 15.3 
AD 3 Hippo | 4.3 | AD 1 Occipital Ctx | 9.9 
AD 4 Hippo | 3.7 | AD 2 Occipital Ctx (Missing) | 0.0 
AD 5 Hippo | 58.6 | AD 3 Occipital Ctx | 7.1 
AD 6 Hippo | 45.4 | AD 4 Occipital Ctx | 15.9 
Control 2 Hippo | 28.5 | AD 5 Occipital Ctx | 26.6 
Control 4 Hippo | 8.4 | AD 6 Occipital Ctx | 15.1 
Control (Path) 3 Hippo | 3.1 | Control 1 Occipital Ctx | 3.6 
AD 1 Temporal Ctx | 4.8 | Control 2 Occipital Ctx | 67.4 
AD 2 Temporal Ctx | 24.7 | Control 3 Occipital Ctx | 31.2 
AD 3 Temporal Ctx | 7.5 | Control 4 Occipital Ctx | 1.8 
AD 4 Temporal Ctx | 10.5 | Control (Path) 1 Occipital Ctx | 100.0 
AD 5 Inf Temporal Ctx | 62.9 | Control (Path) 2 Occipital Ctx | 9.5 
AD 5 Sup Temporal Ctx | 46.3 | Control (Path) 3 Occipital Ctx | 5.3 
AD 6 Inf Temporal Ctx | 43.5 | Control (Path) 4 Occipital Ctx | 10.0 
AD 6 Sup Temporal Ctx | 43.2 | Control 1 Parietal Ctx | 3.8 
Control 1 Temporal Ctx | 4.1 | Control 2 Parietal Ctx | 27.9 
Control 2 Temporal Ctx | 59.0 | Control 3 Parietal Ctx | 15.0 
Control 3 Temporal Ctx | 17.6 | Control (Path) 1 Parietal Ctx | 89.5 
Control 3 Temporal Ctx | 5.0 | Control (Path) 2 Parietal Ctx | 10.2 
Control (Path) 1 Temporal Ctx | 57.0 | Control (Path) 3 Parietal Ctx | 7.0 
Control (Path) 2 Temporal Ctx | 30.4 | Control (Path) 4 Parietal Ctx | 27.9 

Table AJC. General_screening_panel_v1.6 



Tissue Name 


Rel. 

Exp.(%) 
Ag6718, 
Run 

27722381 3 


issue Name 


ReL 

Exp.(%) 
Ag6718, 
Run 

LI IZZSlS>\5 


Adipose 




XVClIiH Ld. llVxv/- 


*\A A 


Melanoma* Hs688f At T 


1ft 4 


xsjaaaer 




Melanoma* Hs688fB , T 


20 0 


oabuic ca. Oliver mei. j inv^XtTno / 


<A A 


Melanoma* VI 14. 


^0 ft 


OaStnC Ca. 1 \J 111 


4o.3 


Melanoma* T OXTMVT 

xrx\^iasx\jina. x**\js\axvx v x 


JJ.l 


i^ojon ca. 0W-745 




Melanoma* STC-MFF -S 

XVXW-KUlVSXiXCl Ui\ 1Y.I 1 *X* *J 


Kl R 


i^oion ca. ovv*fou 


o/.l 


SfltiaiYlOllC *V»11 f*ar<"Mnr\m5i QfT^ 1 — /I 


9** ^ 


L,oion ca.* {>>W4oU met) o Wu2U 


69.7 


Testis Pool 

X wOLij A IAJX 




i^oion ca. xiizy 


o n 


ProQtatf* cji * ftionf* mpt • Pf*-^ 

a l Ui>ltilt> ta> ^UVJUCi llldy 


lUv.w 


fVil/\n r-i WOT 1 1< 

v^oion ca. jiv^ 1-1 10 


j 1.4 


Procf Afp Pool 


1 R 

JL.O 


1^01 on ca. LJai^o-z 


if n 

15.9 


Placenta 


9 ft 


Colon cancer tissue 


23.5 


TTfpniQ Pool 




i^oion ca. 0 w 1 1 10 


25.0 


Ovarian ca OVPAR-^ 


97 4 


v^oion ca. ^oio-zuj 


oi o 


Ovarian ea SK-OV-"3 


90 0 


t«.oion ca. ovy-4o 


24.1 


Ovarian ea OVPAR-4 


^ 0 


v^oion 1^001 


12.4 


Ovarian ca OVPAR-^ 


SO 0 


omaii iniesune Jrooi 


4.o 


Ovarian ca IGROV-1 


47 6 


oLuiiiocn x 001 


1 Q 

l.O 


Ovarian ca OVPAR-8 


"32 8 


x>one iviarrow r 001 


U.U 


Ovary 


117 

XX./ 


^Clal XlCal L 


"MO 
14.2 


Breast ca. MCF-7 


18 9 


Xxcaxi x UUI 


ll.O 


Breast ca. MDA-MB-231 


48.0 


T vnrmH N^rvlo "Prirxl 
XwjxilUii iiUUB ruui 


D.O 


Breast ca. BT 549 


31 6 


JTCLdi OJVClClal xYXUoulC 


J.J 


Breast ca. T47D 


3.6 


.Slcelpfal Mimrlp Pool 


U.U 


Breast ca. MDA-N 


17.9 


Spleen Pool 


2.0 


Breast Pool 


7.0 


Thymus Pool 


11.7 


Trachea 


9.2 


CNS cancer (glio/astro) U87-MG 


32.3 


Lung 


2.4 


CNS cancer (glio/astro) U-l 18-MG 


43.2 


Fetal Lung 


4.9 


CNS cancer (neuro;met) SK-N-AS 


25.9 


Lungca.NCI-N417 


15.0 


CNS cancer (astro) SF-539 


29.5 


Lungca.LX-1 


17.6 


CNS cancer (astro) SNB-75 


59.0 


Lungca.NCI-H146 


23.7 < 


CNS cancer (glio) SNB-1 9 ; 


29.7 


Lung ca. SHP-77 


53.2 < 


2NS cancer (glio) SF-295 i 


)9.5 


Lung ca. A549 \ 


28.3 1 


3rain (Amygdala) Pool ] 


10.4 


Lungca.NCI-H526 ; 


14.3 I 


3rain (cerebellum) ; 


54.4 


Lungca.NCI-H23 


11J I 


Jrain (fetal) 3 


J.3 
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1 ima ro "MPT TJ/tfiO 


1 A 1 


Brain (Hippocampus) Pool ^j^-u. ~i ... 


JLUIig Ca. JtlUJr-OZ 




Cerebral Cortex Pool 


7.4 


i-/UlJg Id. IN lw/I-JXJZZ 




Brain (Substantia nigra) Pool 


3.9 


juivcjt 


l.U 


Brain (Thalamus) Pool 


6.9. 


remi iA ver | 


2.3 


Brain (whole) 


6.5 


Liver ca. HepG2 j 


19.2 


Spinal Cord Pool 


5.6 


Kidney Pool 


15.2 


Adrenal Gland . 


10.3 


Fetal Kidney ! 


4.1 


Pituitary gland Pool 


1.1 


Renal ca. 786-0 


61.6 


Salivary Gland 


3.2 


Renal ca. A498 


5.6 


Thyroid (female) 


11.5 


Renal ca. ACHN 


24.7 | 


Pancreatic ca. CAPAN2 


28.1 


Renal ca. UO-31 


33.9 


Pancreas Pool 


8.3 



Table AJD. Panel 4.1D 



Tissue Name 


Rel. 

Ex.(%) 

Ag6718, 

Run 


Tissue Name 


Rel. 

Exp.(%) 
Ag6718, 
Run 

276596888 


Secondary Thl act 


51.4 


HUVEC IL-lbeta 


18.0 


Secondary Th2 act 


39.5 


HUVECIFN gamma 


16.5 


Secondary Trl act 


19.3 


HUVEC TNF alpha + IFN gamma 


4.5 


Secondary Thl rest 


5.3 


HUVECTNF alpha + IL4 


3.1 


Secondary Th2rest 


4.5 


HUVEC IL-11 


0.0 


Secondary Trl rest 


5.9 


Lung Microvascular EC none 


13.9 


Primary Thl act 


3.5 


Lung Microvascular EC TNFalpha 
+ IL-lbeta 


0.7 


Primary Th2 act 


20.7 


Microvascular Dermal EC none 


3.0 


Primary Trl act 


12.8 


Microsvasular Dermal EC 
TNFalpha + IL-lbeta 


1.2 


Primary Thl rest 


1.6 


Bronchial epithelium TNFalpha + 
ILlbeta 


5.8 


Primary Th2 rest 


5.8 


Small airway epithelium none 


6.3 


Primary Trl rest 


0.7 


Small airway epithelium TNFalpha 
-h IL-lbeta 


9.7 


CD45RA CD4 lymphocyte act 


26.4 


Coronery artery SMC rest 


7.1 


CD45RO CD4 lymphocyte act 


30.8 


Coronery artery SMC TNFalpha + 
IL-lbeta 


8.4 


CDS lymphocyte act 


7.6 


Astrocytes rest 


3.3 


Secondary CD8 lymphocyte rest 


6.3 


Astrocytes TNFalpha + IL-lbeta 


2.9 


Secondary CD8 lymphocyte act 


1.5 


KU-8 12 (Basophil) rest 


44.8 


CD4 lymphocyte none 


3.6 


KU-812 (Basophil) 
PMA/ionomycin 


28.1 
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2ry Thl/Th2/Trl_antihCD95 
CH11 




pci / usoe 

CCDllUo t&eratinocytes) none 


/3JL37" 

27.5 


LAK cells rest 


4.5 


CCD1 106 (Keratinocytes) 
TNFalpha + IL-lbeta 


5 1 

•J* X 


LAK cells EL-2 


9.9 


Liver cirrhosis 


0.8 


LAK cells IL-2+IL-12 


0.7 


NCI-H292 none 


8.0 


LAK cells IL-2+IFN gamma 


4.2 


NCI-H2921L-4 


10.2 


LAK cells IL-2+ IL-18 


1.4 


NCI-H292IL-9 


19.2 


LAK cells PMA/ionomycin 


18.7 


NCI-H292IL-13 


14.8 


NK Cells IL-2 rest 


21.0 


NCI-H292IFN gamma 


6.8 


Two Way MLR 3 day 


7.6 


HPAEC none 


3.7 


Two Way MLR 5 day 


5.2 


HPAEC TNF alpha + BL-1 beta 


8.5 


Two Way MLR 7 day 


4.3 


Lung fibroblast none 


6.8 ; 


PBMC rest 


1.4 


Lung fibroblast TNF alpha + 
beta 


1.9 


PBMCPWM 


3.0 


Lung fibroblast IL-4 


6.1 


PBMCPHA-L 


4.1 


Lung fibroblast IL-9 


10.0 


Ramos (B cell) none 


42.9 


Lung fibroblast IL-13 


7.7 


RamnQ rR cf»ll"fc irninTYivpiTi 

iXClAUVJO VvSJLl J XVJLlKJLLlj \*LLL 


22 1 


T nncy "filimlVljici" TPfiW cramma 
JLjUJUc^ AlUX VJl/ldoL _Ll 1 N tLalllliia. 






iv.O 


Dermal fThrriKlnQt fYTHn7ft r<*«t 




B lymphocytes CD40L and IL-4 


12.2 


Dermal fibroblast CCD1070 TNF
alpha 


100.0 


EOL-1 dbcAMP 


39.0 


Dermal fibroblast CCD1070 IL-1 
beta 


17.4 


EOL-1 dbcAMP 
PMA/ionomycin 


14.1


Dermal fibroblast IFN gamma


« 7 


Dendritic cells none 


13.5 


Dermal fibroblast IL-4 


10.4 


Dendritic cells LPS 


2.5 


Dermal Fibroblasts rest 


6.9 


Dendritic cells anti-CD40 


4.5 


Neutrophils TNFa+LPS 


0.4 


Monocytes rest 


0.6 


Neutrophils rest 


0.7 


Monocytes LPS 


3.9 


Colon 


0.8 


Macrophages rest 


1.4 


Lung 


0.6 


Macrophages LPS 


3.8 


Thymus 


2.9 


HUVEC none 


11.1 


Kidney 


8.1 


HUVEC starved 


6.4 







3 



CNS_neurodegeneration_v1.0 Summary: Ag6718 This panel confirms the
expression of this gene at low levels in the brains of an independent group of individuals.
However, no differential expression of this gene was detected between Alzheimer's
diseased postmortem brains and those of non-demented controls in this experiment. Please
see Panel 1.6 for a discussion of this gene in treatment of central nervous system disorders.

General_screening_panel_v1.6 Summary: Ag6718 Highest expression of this
gene is detected in the prostate cancer PC3 cell line (CT=31.9). Moderate levels of expression
of this gene are also seen in a cluster of cancer cell lines derived from pancreatic, gastric,
colon, lung, liver, renal, breast, ovarian, prostate, squamous cell carcinoma, melanoma and
brain cancers. Thus, expression of this gene could be used as a marker to detect the
presence of these cancers. Furthermore, therapeutic modulation of the expression or
function of this gene may be effective in the treatment of pancreatic, gastric, colon, lung,
liver, renal, breast, ovarian, prostate, squamous cell carcinoma, melanoma and brain
cancers.

In addition, this gene is expressed at low levels in cerebellum and fetal brain.
Therefore, therapeutic modulation of this gene product may be useful in the treatment of
central nervous system disorders such as ataxia and autism.

Panel 4.1D Summary: Ag6718 Highest expression of this gene is detected in TNF
alpha treated dermal fibroblasts (CT=32). Moderate to low levels of expression of this gene
are detected in activated polarized, naive and memory T cells, PMA/ionomycin treated LAK
cells, resting IL-2 treated NK cells, Ramos B cells, eosinophils, activated HUVEC cells,
lung microvascular endothelial cells, basophils and activated mucoepidermoid NCI-H292
cells. Therefore, therapeutic modulation of this gene or its protein product may lead to the
alteration of functions associated with these cell types and lead to improvement of the
symptoms of patients suffering from autoimmune and inflammatory diseases such as
asthma, allergies, inflammatory bowel disease, lupus erythematosus, psoriasis, rheumatoid
arthritis, and osteoarthritis.

AK. CG173017-01: RETINOIC ACID RECEPTOR 
RXR-BETA. 

Expression of gene CG173017-01 was assessed using the primer-probe set Ag7565,
described in Table AKA.

Table AKA. Probe Name Ag7565

Primers   Sequence                                    Length   Start Position   SEQ ID No
Forward   5'-ctggacgggacgggat-3'                      16       222              393
Probe     TET-5'-acatagccgtttgccagccccag-3'-TAMRA     23       261              394
Reverse   5'-cttctgtccccgcagatt-3'



CNS_neurodegeneration_v1.0 Summary: Ag7565 Expression of this gene is
low/undetectable in all samples on this panel (CTs>35).

Panel 4.1D Summary: Ag7565 Expression of this gene is low/undetectable in all
samples on this panel (CTs>35).
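
The panel summaries describe a probe set as low/undetectable when all samples on a panel have CT values above 35. A minimal sketch of that screen follows. The function name, the strict inequality, and the example CT values are assumptions made for illustration and are not taken from the specification.

```python
# Hypothetical helper: flags a panel as low/undetectable when every CT value
# exceeds the cutoff quoted in the panel summaries (CTs > 35).
CT_CUTOFF = 35.0


def panel_is_undetectable(ct_values, cutoff=CT_CUTOFF):
    """Return True when every sample's CT is strictly above the cutoff."""
    if not ct_values:
        raise ValueError("ct_values must not be empty")
    return all(ct > cutoff for ct in ct_values)


# Illustrative CT values only; they are not measurements from the tables.
example_low = [36.2, 38.0, 40.0]
example_detected = [36.2, 31.9, 40.0]

print(panel_is_undetectable(example_low))
print(panel_is_undetectable(example_detected))
```

A single sample at or below the cutoff is enough for the panel to count as detected under this reading.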

AL. CG173347-01: Novel Serum paraoxonase/arylesterase 3. 

Expression of gene CG173347-01 was assessed using the primer-probe set Ag7564, 
described in Table ALA. 

Table ALA. Probe Name Ag7564

Primers   Sequence                                    Length   Start Position   SEQ ID No
Forward   5'-gaaagtggctctgaagatattgatatact-3'         29       153              396
Probe     TET-5'-tcctagtgggctggcttttatctcc-3'-TAMRA   25       182              397
Reverse   5'-actccaacagacctgcagact-3'                 21       207              398
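
The Length column of Table ALA can be cross-checked against the listed sequences, which is a quick way to catch transcription errors in primer tables. The sketch below is illustrative only: the dictionary layout and the function name are assumptions, and the TET and TAMRA labels are left out because they are not bases.

```python
# Sequences and stated lengths transcribed from Table ALA (probe set Ag7564).
TABLE_ALA = {
    "Forward": ("gaaagtggctctgaagatattgatatact", 29),
    "Probe": ("tcctagtgggctggcttttatctcc", 25),
    "Reverse": ("actccaacagacctgcagact", 21),
}


def check_lengths(table):
    """Return the names of entries whose sequence length differs from the stated length."""
    mismatches = []
    for name, (sequence, stated_length) in table.items():
        # Reject anything that is not a lowercase DNA base before counting.
        if set(sequence) - set("acgt"):
            raise ValueError(f"{name}: unexpected character in sequence")
        if len(sequence) != stated_length:
            mismatches.append(name)
    return mismatches


print(check_lengths(TABLE_ALA))
```

An empty list means every stated length agrees with its sequence.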



CNS_neurodegeneration_v1.0 Summary: Ag7564 Expression of this gene is
low/undetectable in all samples on this panel (CTs>35). 

Panel 4.1D Summary: Ag7564 Expression of this gene is low/undetectable in all 
samples on this panel (CTs>35). 

AM. CG56234-02: Splice variant of PCK2. 

Expression of gene CG56234-02 was assessed using the primer-probe set Ag5111, 
described in Table AMA. Results of the RTQ-PCR runs are shown in Tables AMB, AMC, 
AMD and AME. 

Table AMA. Probe Name Ag5111

Primers   Sequence                                    Length   Start Position   SEQ ID No
Forward   5'-ctgggaggccccaga-3'                       15       1377             399
Probe     TET-5'-tgtccccattgacgccatcatc-3'-TAMRA      22       1395             400
Reverse   5'-gatgatcttccctttgggtct-3'                 21       1429             401



Table AMB. General_screening_panel_v1.5



Tissue Name


Rel. Exp.(%) Ag5111, Run 228980587


Tissue Name


Rel. Exp.(%) Ag5111, Run 228980587


Adipose 


2.0 


Renal ca. TK-10 


29.1 


Melanoma* Hs688(A).T 


31.9 


Bladder 


12.1 


Melanoma* Hs688(B).T 


28.3 


Gastric ca. (liver met.) NCI-N87 


31.4 


Melanoma* M14 


9.9 


Gastric ca. KATO III


28.1 


Melanoma* LOXIMVI 


4.5 


Colon ca. SW-948 


17.9 


Melanoma* SK-MEL-5 


39.8 


Colon ca. SW480 


14.9 


Squamous cell carcinoma SCC-4 


4.7 


Colon ca.* (SW480 met) SW620


29.5 


Testis Pool


1.6


Colon ca. HT29 


8.6 


Prostate ca.* (bone met) PC-3


55.1


Colon ca. HCT-116


11.0 


Prostate Pool


0.5


Colon ca. CaCo-2 


44.4 


Placenta




Colon cancer tissue


9.7 


Uterus Pool 


0.6


Colon ca. SW1116


1.4 


Ovarian ca. OVCAR-3 


13.6


Colon ca. Colo-205 


6.6 


Ovarian ca. SK-OV-3 


5.3 


Colon ca. SW-48 


14.4 


Ovarian ca. OVCAR-4 


7.1 


Colon Pool


0.1


Ovarian ca. OVCAR-5 


34.6 


Small Intestine Pool 


0.6 


Ovarian ca. IGROV-1 


22.5 


Stomach Pool 


1.1 


Ovarian ca. OVCAR-8 


100.0 


Bone Marrow Pool 


0.5 


Ovary 


0.0 


Fetal Heart 


0.0 


Breast ca. MCF-7 


87.7 


Heart Pool 


0.0 


Breast ca. MDA-MB-231 


12.6 


Lymph Node Pool 


0.8 


Breast ca. BT 549 


75.8 


Fetal Skeletal Muscle 


0.6 


Breast ca. T47D


10.1 


Skeletal Muscle Pool 


3.4 


Breast ca. MDA-N


16.4 


Spleen Pool 


1.7 


Breast Pool 


0.5 


Thymus Pool 


).4 


Trachea 


4.3 


CNS cancer (glio/astro) U87-MG


18.8 


Lung 


10 


CNS cancer (glio/astro) U-118-MG


13 


Fetal Lung 


2.0 


CNS cancer (neuro;met) SK-N-AS


1.5


Lung ca. NCI-N417


1.8


CNS cancer (astro) SF-539


11.3


Lung ca. LX-1


5.2


CNS cancer (astro) SNB-75




Lung ca. NCI-H146


11.1


CNS cancer (glio) SNB-19


31.0


Lung ca. SHP-77


11.3


CNS cancer (glio) SF-295


32.5


Lung ca. A549


11.4


Brain (Amygdala) Pool


0.4 


Lung ca. NCI-H526


l.o 


Brain (cerebellum)


0.3


Lung ca. NCI-H23


OJ.D 


Brain (fetal)


0.5 


Lung ca. NCI-H460




Brain (Hippocampus) Pool


Z.D 


Lung ca. HOP-62


i a 
l.U 


Cerebral Cortex Pool


0.4


Lung ca. NCI-H522


£7 /I 

0/.4 


Brain (Substantia nigra) Pool


0.0


Liver 


O.J 


Brain (Thalamus) Pool


1.0 


Fetal Liver 


< 7 
O. f 


Brain (whole)


0.7


Liver ca. HepG2




Spinal Cord Pool


1.1 


Kidney Pool 


0.8 


Adrenal Gland


1.6 


Fetal Kidney 


1.0 


Pituitary gland Pool


0.4 


Renal ca. 786-0 


8.7 


Salivary Gland


0.9 


Renal ca. A498


1.5 


Thyroid (female)


0.7 


Renal ca. ACHN 


9.3 


Pancreatic ca. CAPAN2


12.8 


Renal ca. UO-31 


1.9 


Pancreas Pool


0.8 



Table AMC. General_screening_panel_v1.6



Tissue Name


Rel. Exp.(%) Ag5111, Run 277218717


Rel. Exp.(%) Ag5111, Run 277731246


Rel. Exp.(%) Ag5111, Run 278368614


Tissue Name


Rel. Exp.(%) Ag5111, Run 277218717


Rel. Exp.(%) Ag5111, Run 277731246


Rel. Exp.(%) Ag5111, Run 278368614


Adipose 


0.5 


0.0 


1.5 


Renal ca. 
TK-10 


24.7 


20.2 


33.0 


Melanoma* 
Hs688(A).T 


26.1 


29.5 


31.6 


Bladder 


6.7 


6.1 


11.6 


Melanoma* 
Hs688(B).T 


25.2 


32.1 


31.9 


Gastric ca. 
(liver met.) 
NCI-N87 


21.3 


22.5 


36.1 


Melanoma* 
M14 


5.6 


9.7 


7.5 


Gastric ca. 
KATO III


14.6 


12.2 


19.2 


Melanoma* 
LOXIMVI


3.0 


0.0 


4.2 


Colon ca. 
SW-948 


18.8 


16.5 


23.5 


Melanoma* 
SK-MEL-5 


28.7 


57.0 


39.8 


Colon ca. 
SW480 


11.8 


7.3 


19.5 


Squamous cell 

carcinoma 

SCC-4 


4.8 


4.2 


5.1 


Colon ca.* 
(SW480 met) 
SW620 


23.0 


19.9 


35.6 


Testis Pool 


2.0 


0.0 


1.4 


Colon ca. 
HT29 


10.2 


4.2 


8.2 


Prostate ca.* 
(bone met) 
PC-3 


33.2 


44.4 


57.8 


Colon ca. 
HCT-116 


9.6 


7.6 


19.9


Prostate Pool


0.3 


0.0 


0.6


Colon ca.
CaCo-2


9.4 


25.0 


36.9 


Placenta 


0.3 


0.0 


1.1 


Colon cancer

tissue 


6.0 


0.0 


6.6 


Uterus Pool 


0.0 


0.0 


0.6 


Colon ca.
SW1116 


2.3 


0.0 


1.7 


Ovarian ca. 
OVCAR-3 


12.7 


8.2 


18.2 


Colon ca. 
Colo-205 


5.1 


4.7 


5.9 


Ovarian ca. 
SK-OV-3 


5.3 


6.5 


12.2 


Colon ca.
SW-48 


9.0 


0.0 


11.6 


Ovarian ca. 
OVCAR-4 


4.0 


5.2 


5.8 


Colon Pool 


0.7 


0.0 


0.7 


Ovarian ca. 
OVCAR-5 


31.6 


24.8 


34.2 


Small Intestine 
Pool 


0.3 


0.0 


0.8 


Ovarian ca. 
IGROV-1 


19.2 


12.8 


27.2 


Stomach Pool 


1.2 


0.0 


2.3 


Ovarian ca. 
OVCAR-8 


100.0 


100.0 


100.0 


Bone Marrow 
Pool 


0.0 


0.0 


0.0 


Ovary 


0.0 


0.0 


0.2 


Fetal Heart 


0.0 


0.0 


0.3 


Breast ca.
MCF-7 


54.0 


51.4 


77.9 


Heart Pool 


0.4 


0.0 


0.0 


Breast ca.
MDA-MB-231


8.5 


7.6 


7.7 


Lymph Node 
Pool 


1.2 


0.0 


0.0 


Breast ca. BT 
549 


47.0 


30.4 


49.0 


Fetal Skeletal 
Muscle 


0.0 


0.0 


0.0 


Breast ca. T47D 


5.1 


6.5 


7.1 


Skeletal
Muscle Pool 


0.0 


0.0 


0.0 


Breast ca. 
MDA-N 


6.1 


6.0 


24.5 


Spleen Pool 


0.7 


0.0 


25 


Breast Pool 


0.3 


0.0 


0.3 


Thymus Pool 


0.5 


0.0 


1.8 


Trachea


3.3 


0.0 


8.3 


CNS cancer 

(glio/astro) 

U87-MG 


J2.9 


7.9 


13.8 


Lung 


10 


5.0 


3.0 


CNS cancer 

(glio/astro) 

U-118-MG 


5.9 


4.4 


3.1 


Fetal Lung 


).9 ( 


).0 


U ( 

< 


CNS cancer 
(neuro;met) SK-N-AS


5.4 


1.9 


5.7 


Lung ca. 

NCI-N417 1 


L3 ( 


).o : 


,» ; 


INS cancer t 
astro) SF-539 * 


>.8 f 


5.4 


5.5 


Lung ca. LX-1 J 


i.5 1 


'.8 $ 


( 

>.5 


INS cancer 
astro) 2 
JNB-75 


15.0 2 


•9.9 2 


.6.8 


Lung ca.

NCI-H146 8 


.0 8 


.5 1 


- \ 


INS cancer „ 
gIio)SNB-19 


3.8 2 


.0.7 2 


9.5 


Lung ca. 

SHP-77 1 


2.2 1 


4.3 2 


( 


INS cancer , 
glio)SF-295 


8.2 2 


8.7 4 


6.7


Lung ca. A549


11.5 


11.7 


15.9 


Brain 

(Amygdala) 
Pool 


0.8


0.0


1.1


Lung ca. 
NCI-H526 


1.8 


0.0 


1.7 


Brain 

(cerebellum) 


1.0


0.0 


1.1 


Lung ca. 
NCI-H23 


42.6 


68.8 


55.1 


Brain (fetal) 


0.0 


0.0 


0.4 


Lung ca. 
NCI-H460


16.7 


23.5 


38.4 


Brain 

(Hippocampus)
Pool


0.4 


0.0 


1.2 


Lung ca.
HOP-62


2.0 


0.0 


3.0 


Cerebral
Cortex Pool 


0.0 


0.0 


0.6 


Lung ca. 
NCI-H522 


41.5 


64.2 


87.1 


Brain 
(Substantia 
nigra) Pool 


0.0 


0.0 


0.4 


Liver 


4.4 


4.6 


7.1 


Brain 

(Thalamus) 

Pool 


0.0 


0.0 


0.0 


Fetal Liver 


5.8 


33 


8.7 


Brain (whole)


6.7 


0.0 


2.8 


Liver ca. 
HepG2 


if T 

15./ 


10.3 


IOC? 


Spinal Cord 
Pool 


0.6 


0.0


0.5 


Kidney Pool 


0.7 


0.0 


0.3 


Adrenal Gland 


1.4 


0.0 


1.4 


Fetal Kidney 


0.9 


0.0 


1.0 


Pituitary gland 
Pool 


0.0 


0.0 


0.7 


Renal ca. 786-0 


9.3 


8.1 


13.8 


Salivary Gland 


0.8 


0.0 


1.8 


Renal ca. A498


1.1


0.0 


2.0 


Thyroid 
(female) 


1.0 


0.0 


2.1 


Renal ca. 
ACHN 


5.8


6.0 


10.8


Pancreatic ca. 
CAPAN2 


13.1 


9.6 


19.9 


Renal ca. 
UO-31 


2.4 


0.0 


3.3 


Pancreas Pool 


4.8 


0.0 


7.3 
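
The three runs reported in Table AMC can be compared numerically. The sketch below computes pairwise Pearson correlations for a handful of rows transcribed from that table. The choice of rows, the variable names, and the use of Pearson correlation as the measure of agreement are assumptions made for illustration; the specification does not say how agreement between runs was judged.

```python
from itertools import combinations
from math import sqrt

# Rel. Exp.(%) values from Table AMC; one tuple per sample, one entry per run
# (Run 277218717, Run 277731246, Run 278368614).
ROWS = {
    "Renal ca. TK-10": (24.7, 20.2, 33.0),
    "Melanoma* Hs688(A).T": (26.1, 29.5, 31.6),
    "Bladder": (6.7, 6.1, 11.6),
    "Ovarian ca. OVCAR-8": (100.0, 100.0, 100.0),
    "Breast ca. MCF-7": (54.0, 51.4, 77.9),
    "Colon ca. SW-948": (18.8, 16.5, 23.5),
}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Transpose the rows into one list of values per run.
runs = list(zip(*ROWS.values()))
correlations = {
    (i + 1, j + 1): pearson(runs[i], runs[j])
    for i, j in combinations(range(len(runs)), 2)
}
for pair, r in correlations.items():
    print(f"run {pair[0]} vs run {pair[1]}: r = {r:.3f}")
```

Coefficients close to 1 for every pair would be consistent with runs that rank the samples in the same way.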



Table AMD. Panel 4.1D



Tissue Name


Rel. Exp.(%) Ag5111, Run 226444761


Rel. Exp.(%) Ag5111, Run 276596864


Tissue Name


Rel. Exp.(%) Ag5111, Run 226444761


Rel. Exp.(%) Ag5111, Run 276596864


Secondary Th1 act


90.8 


58.6 


HUVEC IL-1beta


18.7


10.7 


Secondary Th2 act 


40.9 


57.8 


HUVEC IFN gamma


2.8 


6.2 


Secondary Tr1 act


57.4 


16.5 


HUVEC TNF alpha +
IFN gamma 


5.0 


6.2 


Secondary Th1 rest


27.2 


8.4 


HUVEC TNF alpha +
IL4 


23.2 


8.8 


Secondary Th2 rest 


6.0 


0.0 


HUVEC IL-11 


2.3 


0.0


Secondary Tr1 rest


7.2 


4.0 


Lung Microvascular
EC none 


3.2 


15.4 


Primary Th1 act


32.8 


5.0 


Lung Microvascular 
EC TNFalpha + 
IL-1beta


6.4 


0.0 


Primary Th2 act 


49.0 


19.9 


Microvascular 
Dermal EC none 


6.6 


0.0 


Primary Tr1 act


50.0 


38.4 


Microvascular
Dermal EC
TNFalpha + IL-1beta


0.0 


0.0 


Primary Th1 rest


6.0 


8.5 


Bronchial epithelium 


8.7 


6.9 


Primary Th2 rest 


6.4 


6.3 


Small airway
epithelium none


2.2 


0.0 


Primary Tr1 rest


18.0 


0.0 


Small airway
epithelium TNFalpha
+ IL-1beta


11.8 


0.0


CD45RA CD4
lymphocyte act 


95.9 


76.8 


Coronary artery SMC
rest 


18.3 


10.2 


CD45RO CD4 
lymphocyte act 


95.3 


100.0 


Coronary artery SMC
TNFalpha + IL-1beta


9.4 


8.8 


CD8 lymphocyte act 


77.4 


4.5 


Astrocytes rest 


2.1 


0.0 


Secondary CD8 
lymphocyte rest 


90.1 


17.3 


Astrocytes TNFalpha
+ IL-1beta


0.0 


0.0


Secondary CD8 
lymphocyte act 


21.0 


7.7 


KU-812 (Basophil) 
rest 


25.9 


10.2


CD4 lymphocyte none 


0.0 


0.0 


KU-812 (Basophil)
PMA/ionomycin


26.8 


21.2 


2ry 

Th1/Th2/Tr1_anti-CD95
CH11 


5.4 


0.0 


CCD1106 

(Keratinocytes) none 


15.2 


4.9 


LAK cells rest


43.5 


19.9 


CCD1106 
(Keratinocytes) 
TNFalpha + IL-1beta


9.0 


12.3 


LAK cells IL-2


52.1 


18.4 


Liver cirrhosis 


8.3 


3.0 


LAK cells IL-2+IL-12 


33.7 


3.0 ] 


^CI-H292none 


15.3 : 


Ta J 


LAK cells 1L-2+IFN 
gamma 


y r.\J ( 


J.O ] 


NCI-H292 IL-4 


13.5 ] 


17.2 


LAK cells IL-2+ EL-1 8 l 


16.0 S 


).5 I 


^CI-H2921L-9 1 


14.2 1 


14.1 ] 


LAK cells 
PMA/ionomycin 


[35 • ■ J 


54.5 I 


^CI-H292 IL-13 "i 




i i \ 

1.3 | 


NK Cells IL-2 rest ( 


i0.7 3 


.7.4 1 


4CI-H292 IFN 
;amma 


14.8 7 


.2 


Two Way MLR 3 day 3 


2.1 1 


0.3 1 


BPAECnone 2 


.0 0 


.0 j 


Two Way MLR 5 day 5 


3.2 3 


■« I 


IPAEC TNF alpha + - 
L-l beta 7 


.2 . 7 


.0 


Two Way MLR 7 day 2 


3.5 9 


.6 L 


<ung fibroblast none 2 


1.2 1 


5.9


PBMC rest


6.1 


0.0 


Lung fibroblast TNF
alpha + IL-1 beta


11.5


0.0 


PBMC PWM


23.5 


9.1 


Lung fibroblast IL-4


2.4 


0.0 


PBMC PHA-L 


35.8 


12.2 


Lung fibroblast IL-9 


17.6 


5.4 


Ramos (B cell) none 


58.6 


16.7 


Lung fibroblast IL-13 


13.4 


0.0 


Ramos (B cell)
ionomycin 


71.7 


92.7 


Lung fibroblast IFN 
gamma 


11.6 


3.1 


B lymphocytes PWM 


21.6 


14.8 


Dermal fibroblast 
CCD1070 rest


99.3 


64.6 


B lymphocytes CD40L
and IL4 


29.7 


23.2 


Dermal fibroblast 
CCD 1070 TNF alpha 


74.7 


88.9 


EOL-1 dbcAMP 


32.3 


32.8 


Dermal fibroblast 
CCD 1070 IL-1 beta 


29.9 


50.0 


EOL-1 dbcAMP 
PMA/ionomycin 


10.6 


3.2 


Dermal fibroblast 
IFN gamma 


13.3 


0.0 


Dendritic cells none 


66.0 


24.5 


Dermal fibroblast 
IL-4 


J.Z.Z 


IMJ 






00 

v/.v/ 


Dermal Fibroblasts 
rest 


0.0 


0.0 


Dendritic cells 
anti-CD40


48.3 


28.1 


Neutrophils 
TNFa+LPS


0.0 


0.0 


Monocytes rest 


29.1 


0.0 


Neutrophils rest 


0.0 


0.0 


Monocytes LPS 


37.6 


18.0 


Colon 


32.3 


8.2 


Macrophages rest 


100.0 


12.9 


Lung 


3.5 


0.0 


Macrophages LPS 


28.1 


16.2 


Thymus 


12.1 


0.0 


HUVEC none


7.9 


5.7 


Kidney 


83.5 


31.9 


HUVEC starved 


17.4 


8.4 









Table AME. general oncology screening panel_v_2.4



Tissue Name


Rel. Exp.(%) Ag5111, Run 260280403


Tissue Name


Rel. Exp.(%) Ag5111, Run 260280403


Colon cancer 1 


49.0 


Bladder cancer NAT 2 


0.0 


Colon cancer NAT 1 


2.5 


Bladder cancer NAT 3 


0.0 


Colon cancer 2 


11.7 


Bladder cancer NAT 4 


0.0 


Colon cancer NAT 2 


28.5 


Prostate adenocarcinoma 1 


5.0 


Colon cancer 3 


43.5 


Prostate adenocarcinoma 2 


0.0 


Colon cancer NAT 3 


53.2 


Prostate adenocarcinoma 3 


0.0 


Colon malignant cancer 4 


100.0 


Prostate adenocarcinoma 4 


0.0 


Colon normal adjacent tissue 4 


8.4 


Prostate cancer NAT 5 


0.0 


Lung cancer 1 


12.2 


Prostate adenocarcinoma 6 


0.0 


Lung NAT 1 


0.0 


Prostate adenocarcinoma 7 


0.0


Lung cancer 2


72.2




Lung NAT 2


0.0


Prostate adenocarcinoma 9


4.0


Squamous cell carcinoma 3


lO Q 

JLo.o 


Prostate cancer NAT 10


0.0


Lung NAT 3 


0.0 


Kidney cancer 1 


7.5 


metastatic melanoma 1 


0.0 


Kidney NAT 1


0.0 


Melanoma 2 


6.3 


Kidney cancer 2 


73.2 


Melanoma 3 


0.0 


Kidney NAT 2 


9.2 


metastatic melanoma 4 


0.0 


Kidney cancer 3 


6.3 


metastatic melanoma 5 


2.0 


Kidney NAT 3 


0.0 


Bladder cancer 1


0.0


Kidney cancer 4 


7.6 


Bladder cancer NAT 1


0.0


Kidney NAT 4 


84.1 


Bladder cancer 2


0.6







CNS_neurodegeneration_v1.0 Summary: Ag5111 Expression of the
CG56234-02 gene is low/undetectable in all samples on this panel (CTs>35).

General_screening_panel_v1.5 Summary: Ag5111 Highest expression of the
CG56234-02 gene is seen in an ovarian cancer cell line (CT=30). This gene encodes a
splice variant of PEPCK2, the rate-limiting enzyme for gluconeogenesis that has been
shown to be regulated in response to hormones and environmental stress. In addition to the
ovarian cancer cell line, this gene is expressed at a moderate level in most of the cancer cell
lines used in this panel. Therefore, modulation of the gene product using small molecule
drugs may affect the growth and survival of cancer cells. Expression of this gene could
potentially be used as a diagnostic marker of the metabolic status of cells and inhibition of
activity of this gene product might be used for therapeutic treatment of cancers.

This gene is also moderately expressed (CT values = 34) in adult and fetal liver. 
Inhibition of this enzyme could potentially decrease hepatic glucose production and thus 
serve as an effective treatment for Type 2 diabetes, which is characterized by excess 
hepatic glucose production. 
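
The panel tables report expression as a percentage of the highest-expressing sample (100.0), while the summaries quote CT values. This section does not state how one is converted into the other. The sketch below assumes the common two-fold-per-cycle relation, in which a difference of one cycle corresponds to a two-fold difference in template; the function name, sample names, and CT values are hypothetical.

```python
def relative_expression(ct_by_sample):
    """Scale each CT to a percentage of the lowest-CT (highest-expressing) sample.

    Assumes ideal amplification: one cycle equals a two-fold difference.
    """
    if not ct_by_sample:
        raise ValueError("ct_by_sample must not be empty")
    ct_min = min(ct_by_sample.values())
    return {
        name: 100.0 * 2.0 ** (ct_min - ct)
        for name, ct in ct_by_sample.items()
    }


# Hypothetical CT values chosen for illustration only.
example_cts = {"sample A": 30.0, "sample B": 31.0, "sample C": 34.0}
for name, percent in relative_expression(example_cts).items():
    print(name, percent)
```

Under this assumption a sample four cycles behind the best sample would be reported at about 6% of its expression.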

General_screening_panel_v1.6 Summary: Ag5111 Three experiments with the
same probe and primer set produce results that are in excellent agreement. Highest expression
is seen in an ovarian cancer cell line (CTs=31-34) and overall, expression of this gene
appears to be more highly associated with cancer cell line samples than with normal tissue
samples. These results are also in agreement with results in Panel 1.5. Please see that panel
for discussion of this gene.

Panel 4.1D Summary: Ag5111 This gene is expressed in a wide
range of cell types across this panel (CTs=33.3-34.4), including CD4 T cells (naive and memory
T cells), CD8 T cells, B cells and macrophages. Expression of this transcript is also found
in dermal fibroblasts and kidney. This transcript encodes a homolog of a key enzyme in
gluconeogenesis and therefore may be important for the metabolic status of all these cell types,
which contribute to the inflammatory response. Therefore, modulation of the activity or
expression of this putative protein by small molecules could affect the activity of these cells
and be useful for the treatment of autoimmune diseases such as inflammatory bowel
diseases, rheumatoid arthritis, asthma, COPD, psoriasis and lupus.

general oncology screening panel_v_2.4 Summary: Ag5111 Low but significant
expression is seen in a colon cancer, a kidney cancer, and a lung cancer (CTs=34-35). This
is in agreement with the preferential expression in cancer cell lines seen in Panels 1.5 and
1.6. Please see Panel 1.5 for discussion of this gene in oncology.
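
Table AME pairs several tumor samples with normal adjacent tissue (NAT) from the same numbered case, so the tumor and NAT values can be compared case by case. The sketch below uses the colon and kidney pairs transcribed from that table. The pairing by case number, the short case labels, and the simple greater-than comparison are assumptions made for illustration, not an analysis described in the specification.

```python
# (tumor, NAT) Rel. Exp.(%) values from Table AME, keyed by a shortened case label.
PAIRS = {
    "Colon 1": (49.0, 2.5),
    "Colon 2": (11.7, 28.5),
    "Colon 3": (43.5, 53.2),
    "Colon 4": (100.0, 8.4),
    "Kidney 1": (7.5, 0.0),
    "Kidney 2": (73.2, 9.2),
    "Kidney 3": (6.3, 0.0),
    "Kidney 4": (7.6, 84.1),
}


def tumor_higher(pairs):
    """Return the cases in which the tumor value exceeds the matched NAT value."""
    return [name for name, (tumor, nat) in pairs.items() if tumor > nat]


higher = tumor_higher(PAIRS)
print(f"{len(higher)} of {len(PAIRS)} cases show higher expression in tumor:")
for name in higher:
    print(" ", name)
```

The comparison is per case only and says nothing about statistical significance.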

AN. CG56836-03: Cathepsin B. 

Expression of gene CG56836-03 was assessed using the primer-probe sets Ag2052,
Ag5277 and Ag5278, described in Tables ANA, ANB and ANC. Results of the RTQ-PCR runs are
shown in Tables AND, ANE, ANF, ANG, ANH, ANI, ANJ and ANK.

Table ANA. Probe Name Ag2052

Primers   Sequence                                    Length   Start Position   SEQ ID No
Forward   5'-gtcccaccatcaaagagatca-3'                 21       414              402
Probe     TET-5'-agaccagggctcctgtggctcct-3'-TAMRA     23       436              403
Reverse   5'-atgcagatccggtcagagat-3'                  20       485              404



Table ANB. Probe Name Ag5277

Primers   Sequence                                    Length   Start Position   SEQ ID No
Forward   5'-gatctgcatccacaccaat-3'                   19       390              405
Probe     TET-5'-cctgctcacctgcctgctctacaagt-3'-TAMRA  26       441              406
Reverse   5'-cagtcagtgttccaggagtt-3'                  20       568              407

Table ANC. Probe Name Ag5278

Primers   Sequence                                    Length   Start Position   SEQ ID No
Forward   5'-tatgaatccaatagcgaga-3'                   19       653              408
Probe     TET-5'-agctttctctgtgtattcggacttcc-3'-TAMRA  26       715              409
Reverse   5'-tgttggtacactcctgactt-3'                  20       749              410



Table AND. AI_comprehensive panel_v1.0



Tissue Name


Rel. Exp.(%) Ag2052, Run 275804031


Tissue Name


Rel. Exp.(%) Ag2052, Run 275804031


110967 COPD-F 


10.2 


112427 Match Control Psoriasis-F




110980 COPD-F 


6.4


112418 Psoriasis-M


in a 


110968 COPD-M 


12.0 


112723 Match Control Psoriasis-M


S 0 


110977 COPD-M 


14.0 


112419 Psoriasis-M


19 O 1 


110989 Emphysema-F


15.6 


112424 Match Control Psoriasis-M


4.3 


110992 Emphysema-F 


20.0 


112420 Psoriasis-M 


29.7 


110993 Emphysema-F 


13.8 


112425 Match Control Psoriasis-M 


14.8 


110994 Emphysema-F 


6.0 


104689 (MF) OA Bone-Backus 


29.9 


110995 Emphysema-F 


33.2 


104690 (MF) Adj "Normal" 
Bone-Backus 


15.4 


110996 Emphysema-F 


8.5 


104691 (MF) OA Synovium-Backus 


55.9 


110997 Asthma-M 


6.1 


104692 (BA) OA Cartilage-Backus 


27.9 


111001 Asthma-F 


6.7 


104694 (BA) OA Bone-Backus 


39.5 


111002 Asthma-F 


11.2 


104695 (BA) Adj "Normal" 
Bone-Backus 


23.0 


111003 Atopic Asthma-F


9.7 


104696 (BA) OA Synovium-Backus 


100.0 


111004 Atopic Asthma-F


12.2 


104700 (SS) OA Bone-Backus 


12.2 


111005 Atopic Asthma-F 


7.4 


104701 (SS) Adj "Normal" 
Bone-Backus 


24.3 


111006 Atopic Asthma-F


1.7


104702 (SS) OA Synovium-Backus 


43.8 


111417 Allergy-M


9.0 


117093 OA Cartilage Rep7 


18.4 


112347 Allergy-M 


0.0


112672 OA Bone5 


173 


112349 Normal Lung-F 


3.0 


112673 OA Synovium5


5.6 


112357 Normal Lung-F 


10.7 


112674 OA Synovial Fluid cells5


5.4 


112354 Normal Lung-M


J.6 j 


117100 OA Cartilage Rep14


5.4 


112374 Crohns-F 


10.6


112756 OA Bone9


13.4


112389 Match Control Crohns-F


14.1


112757 OA Synovium9


4Q.!i J. 3 J j. 


112375 Crohns-F 


9.9 


112758 OA Synovial Fluid Cells9


5.0 


112732 Match Control Crohns-F


6.6 


117125 RA Cartilage Rep2


19.5 


112725 Crohns-M 


13 


113492 Bone2 RA


11.7 


112387 Match Control 
Crohns-M 


11.7 


113493 Synovium2 RA 


3.6 


112378 Crohns-M 


0.0 


113494 Syn Fluid Cells RA


6.7 


112390 Match Control 
Crohns-M 


14.5 


113499 Cartilage4 RA


6.7 


112726 Crohns-M 


11.5 


113500 Bone4 RA


6.3 


112731 Match Control 
Crohns-M 


7.5 


113501 Synovium4 RA


5.1 


112380 Ulcer Col-F 


8.7 


113502 Syn Fluid Cells4 RA


3.4 


112734 Match Control Ulcer 
Col-F 


15.4 


113495 Cartilage3 RA


7.2 


112384 Ulcer Col-F 


25.7 


113496 Bone3 RA


7.0 


112737 Match Control Ulcer 
Col-F 


4.1 


113497 Synovium3 RA 


4.4 


112386 Ulcer Col-F 


7.1 


113498 Syn Fluid Cells3 RA


9.7 


112738 Match Control Ulcer
Col-F 


13.1 


117106 Normal Cartilage Rep20


8.1 


112381 Ulcer Col-M 


0.1 


113663 Bone3 Normal


0.0 


112735 Match Control Ulcer 
Col-M 


0.4 


113664 Synovium3 Normal


0.0 


112382 Ulcer Col-M 


12.9 


113665 Syn Fluid Cells3 Normal


0.0 


112394 Match Control Ulcer 
Col-M 


3.3 


117107 Normal Cartilage Rep22


3.2 


112383 Ulcer Col-M 


30.4 


113667 Bone4 Normal


6.3 


112736 Match Control Ulcer 
Col-M 


11.0 


113668 Synovium4 Normal


8.1 


112423 Psoriasis-F 


5.5 


113669 Syn Fluid Cells4 Normal


12.9 



Table ANE. General_screening_panel_v1.5



Tissue Name


Rel. Exp.(%) Ag5278, Run 230509757


Tissue Name


Rel. Exp.(%) Ag5278, Run 230509757


Adipose 


0.2 


Renal ca. TK-10 


6.2 


Melanoma* Hs688(A).T 


24.0 


Bladder 


5.1 


Melanoma* Hs688(B).T 


12.9 


Gastric ca. (liver met.) NCI-N87 


9.7 


Melanoma* M14 


51.8 


Gastric ca. KATO III


5.7 


Melanoma* LOXIMVI


26.6 


Colon ca. SW-948 


2.1 


Melanoma* SK-MEL-5 


17.0 


Colon ca. SW480 


7.0


Squamous cell carcinoma SCC-4


i 1 

5.JL 


Colon ca.* (SW480 met) SW620


372 


Testis Pool


A £ 


Colon ca. HT29


0.7 


Prostate ca.* (bone met) PC-3


A A 


Colon ca. HCT-116


2.6 


Prostate Pool 


u.z 


Colon ca. CaCo-2


5.3 


Placenta 


^ A 


Colon cancer tissue


14.5


Uterus Pool 


A A 


Colon ca. SW1116


2.3 


Ovarian ca. OVCAR-3


16.3 


Colon ca. Colo-205 


7.9 


Ovarian ca. SK-OV-3


18.7 


Colon ca. SW-48


2.7 


Ovarian ca. OVCAR-4


3.9


Colon Pool 


1.8 


Ovarian ca. OVCAR-5


5.7 


Small Intestine Pool 


0.7 


Ovarian ca. IGROV-1


0.3


Stomach Pool 


1.2 


Ovarian ca. OVCAR-8


1.3 


Bone Marrow Pool 


0.3 


Ovary 


3.2 


Fetal Heart 


0.5 


Breast ca. MCF-7 


3.0 


Heart Pool 


1.2 


Breast ca. MDA-MB-231 


4.1


Lymph Node Pool 


2.9 


Breast ca. BT 549 


100.0 


Fetal Skeletal Muscle 


0.3 


Breast ca. T47D 


2.0 


Skeletal Muscle Pool 


1.0


Breast ca. MDA-N


1.6 


Spleen Pool 


2.1 


Breast Pool 


2.0 


Thymus Pool 


1.4 


Trachea 


2.3 


CNS cancer (glio/astro) U87-MG


8.1 


Lung 


0.5 


CNS cancer (glio/astro) U-118-MG


12.3 


Fetal Lung 


2.2 


CNS cancer (neuro;met) SK-N-AS 


2.0 


Lung ca. NCI-N417


0.1 


CNS cancer (astro) SF-539 


3.4 


Lung ca. LX-1 


6.1 


CNS cancer (astro) SNB-75 


27.4 


Lung ca. NCI-H146


0.4 


CNS cancer (glio) SNB-19 


2.4 


Lung ca. SHP-77


1.8 


CNS cancer (glio) SF-295


26.8 


Lung ca. A549 


4.1


Brain (Amygdala) Pool


2.1 


Lung ca. NCI-H526 


0.1 


Brain (cerebellum) 


6.9 


Lung ca. NCI-H23 


3.0 


Brain (fetal) 


1.2 


Lung ca. NCI-H460


2.6 


Brain (Hippocampus) Pool 


1.9 


Lung ca. HOP-62


4.0


Cerebral Cortex Pool 


3.8 


Lung ca. NCI-H522


l.O 


Brain (Substantia nigra) Pool 


2.6 


Liver 


1.4 


Brain (Thalamus) Pool 


2.8 


Fetal Liver


10.4


Brain (whole) 


5.3 


Liver ca. HepG2


j«o 


Spinal Cord Pool




Kidney Pool 


0.0 


Adrenal Gland 


3.2 


Fetal Kidney 


0.7 


Pituitary gland Pool 


0.6 


Renal ca. 786-0 


5.3 


Salivary Gland 


2.5 


Renal ca. A498 


4.0 


Thyroid (female)


25.3 


Renal ca. ACHN 


3.0 


Pancreatic ca. CAPAN2 


5.7 


Renal ca. UO-31 


15.2 


Pancreas Pool 


3.0



Table ANF. HASS Panel v1.0



Tissue Name


Rel. Exp.(%) Ag2052, Run 247736616


Rel. Exp.(%) Ag2052, Run 248455625


Tissue Name


Rel. Exp.(%) Ag2052, Run 247736616


Rel. Exp.(%) Ag2052, Run 248455625


MCF-7 C1


1 o *c 

lz.o 


7.1 


U87-MGF1 (B) 


4U.3 


22.4 


MCF-7 C2 


12.7 


8.6 


U87-MGF2 


11.1 


6.7 


MCF-7 C3 


10.2 


5.6 


U87-MGF3 


12.2 


8.0 


MCF-7 C4 


16.2 


19.5 


U87-MGF4 


27.0 


17.8 


MCF-7 C5 


13.2 


11.0


U87-MGF5


59.0 


38.2 


MCF-7 C6


13.2 


14.6


U87-MGF6


61.1 


44.4 


MCF-7 C7 


12.7 


10.4


U87-MGF7


72.7 


50.7 


MCF-7 C9 


9.7 


12.9 


U87-MGF8 


75.3 


54.7 


MCF-7 C10


15.8 


17.1 


U87-MGF9 


29.9 


28.1 


MCF-7 C11


2.5 


1.8 


U87-MGF10 


65.1 


50.0 


MCF-7 C12


9.9 


8.0 


U87-MGF11 


58.2 


48.3 


MCF-7 C13


12.5 


17.1 


U87-MGF12 


47.0 


42.6 


MCF-7 C15 


5.6 


6.5 


U87-MGF13 


95.3 


77.9 


MCF-7 C16 


14.0 


21.5 


U87-MGF14 


96.6 


80.1 


MCF-7 C17 


10.2 


6.9 


U87-MGF15 


64.6 


54.7 


T24D1 


25.0 


14.4 


U87-MGF16 


51.8 


47.6 


T24D2 


33.0 


42.0 


U87-MGF17 


62.0 


49.0 


T24 D3 


29.3 


19.1 


LnCAP A1


9.4 


6.0 


T24D4 


39.8 


30.6 


LnCAP A2 


8.1 


5.5 


T24D5 


28.5 


19.5 


LnCAP A3 


6.3 


3.4 


T24 D6


32.8 


27.2 


LnCAP A4 


10.4 


6.9 


T24 D7


18.3 


25.9 


LnCAP A5 


10.0 


6.0 


T24 D9


12.1 


8.5 


LnCAP A6 


10.0 


6.3 


T24 D10 


23.5 


19.2 


LnCAP A7 


9.2 


6.6 


T24 D11


13.2 


11.7 


LnCAP A8 


11.5 


8.8 






19.2 


LnCAP A9 


10.8 


7.2 


T24D13 


8.5 


5.8 


LnCAP A10 


11.0 


8.0 


T24D15 


10.7 


8.0 


LnCAP A11


15.7 


10.7 


T24D16 


6.6 


4.7 


LnCAP A12 


3.5 


2.3 


T24D17 


12.0 


7.4 


LnCAP A13 


5.7 


3.3


CAPaNB1


64.6 


52.1 


LnCAP A14 


3.3 


1.7 


CAPaNB2 


46.3 


33.2 


LnCAP A15 


2.5 


1.3 


CAPaNB3 


13.0 


10.7 


LnCAP A16 


12.5 


3.6 


CAPaNB4 


39.8 


30.4 


LnCAP A17 


12.2 


2.5 


CAPaNB5 


39.5 


28.7 


Primary Astrocytes 


♦7.3 27.9


CAPaNB6


27.5


25.7


Primary Renal
Proximal Tubule
Epithelial cell A2


100.0


100.0


CAPaNB7 


30.1 


31.2 


Primary melanocytes 
A5 


40.1 


21.8 


CAPaNB8 


33.2 


26.8 


126443 - 341 medullo 


0.7 


0.4 


CAPaNB9 


38.7 


50.0 


126444 - 487 medullo 


2.2 


1.8 


CAPaNB10


57.4 


51.4 


126445 -425 medullo 


1.6 


l A) 


CAPaNB11


45.1 


28.5 


126446 - 690 medullo 


4.4 


z.o 




•21 A 

31.4 


22.7 


126447 - 54 adult 
glioma 


33.4 


22.2 


CAPaNB13 


38.7 


29.7 


126448 -245 adult 
glioma 


9.4 


6.3 


CAPaNB14


29.9 


22.1 


126449 -317 adult 
glioma 


10.4 


6.0 


CAPaNB15 


32.8 


20.7 


126450 -212 glioma 


41.5 


?.2 8 


CAPaNB16 


29.7 


16.4 


126451 -456 glioma 


17.4 


11.3 


CAPaNB17


42.3 


24.3 









Table ANG. Panel 1.3D



Tissue Name


Rel. Exp.(%) Ag2052, Run 166004256


Tissue Name


Rel. Exp.(%) Ag2052, Run 166004256


Liver adenocarcinoma 


21.8 


Kidney (fetal) 


19.2 


Pancreas 


4.2 


Renal ca. 786-0 


8.4 


Pancreatic ca. CAPAN 2 


24.5 


Renal ca. A498 


26.4 


Adrenal gland 


11.7 


Renal ca.RXF393 


34.4 


Thyroid 


37.6 


Renal ca. ACHN


9.3 


Salivary gland 


25.3 


Renal ca. UO-31


33.7 


Pituitary gland


13.8


Renal ca.TK-10 


2.8 


Brain (fetal) 


11.7 


Liver 


14.0 


Brain (whole) 


51.4


Liver (fetal) 


16.2 


Brain (amygdala) 


29.5 


Liver ca. (hepatoblast) HepG2 


33.9 


Brain (cerebellum) 


24.3


Lung 


22.8 


Brain (hippocampus) 


24.5 


Lung (fetal) 


10.7 


Brain (substantia nigra) 


17.8 


Lung ca. (small cell) LX-1 I 


25.2 


Brain (thalamus) 


27.5 


Lung ca. (small cell) NCI-H69


2.1


Cerebral Cortex 


*5.4 


Lung ca. (s.cell var.) SHP-77


5.9 


Spinal cord


30.4


Lung ca. (large cell) NCI-H460


U 


glio/astro U87-MG 1 


*2.6 1 


Lung ca. (non-sm. cell) A549


U 


gho/astroU-118-MG 2 


13.5 1 


Lung ca. (non-s.cell) NCI-H23


u 
astrocytoma SW1783


24.3


Lung ca. [illegible]




neuro*; met SK-N-AS


5.4 


Lung ca. (non-s.cl) NCI-H522 


3.4 


astrocytoma SF-539


43.8 


Lung ca. (squam.) SW 900 


18.4 


astrocytoma SNB-75


21.9 


Lung ca. (squam.) NCI-H596 


1.9 


glioma SNB-19


20.7 


Mammary gland 


15.5 


glioma U251 


43.2 


Breast ca * (pl.ef) MCF-7 


10.7 


glioma SF-295 


25.5 


Breast ca * (pl.ef) MDA-MB-231 


13.2 


Heart (fetal) 


15.2 


Breast ca * (pl.ef) T47D 


6.0 


Heart 


13.7 


Breast ca. BT-549 


100.0 


Skeletal muscle (fetal) 


8.2


Breast ca. MDA-N


3.7 


Skeletal muscle 


11.8


Ovary


23.5 


Bone marrow 


19.5


Ovarian ca. OVCAR-3


14.1 


Thymus 


7.7


Ovarian ca. OVCAR-4


20.7 


Spleen 


34.6


Ovarian ca. OVCAR-5 


23.5 


Lymph node 


17.4 


Ovarian ca. OVCAR-8 


7.8 


Colorectal 


12.5 


Ovarian ca. IGROV-1 


5.1 


Stomach 


8.0 


Ovarian ca.* (ascites) SK-OV-3


27.9


Small intestine 


12.2


Uterus 


11.0 


Colon ca. SW480 


9.7 


Placenta 


40.3 


Colon ca * SW620(SW480 met) 


5.9 


Prostate


8.0 


Colon ca. HT29 


1.2 


Prostate ca.* (bone met)PC-3 


8.4 


Colon ca. HCT-116


4.8 


Testis 


4.3 


Colon ca. CaCo-2 


15.7 


Melanoma Hs688(A).T 


22.7 


Colon ca. tissue(OD03866) 


52.4


Melanoma* (met) Hs688(B).T


21.8


Colon ca. HCC-2998


12.9


Melanoma UACC-62


23.0


Gastric ca.* (liver met) NCI-N87


21.9


Melanoma M14


13.2


Bladder


11.4


Melanoma LOX IMVI


11.2


Trachea


13.1


Melanoma* (met) SK-MEL-5


>2.8


Kidney


51.0


Adipose


.2.8 



Table ANH. Panel 2.2 



Tissue Name 


Rel. Exp.(%) Ag2052, Run 174244470


Tissue Name


Rel. Exp.(%) Ag2052, Run 174244470


Normal Colon 


3.3 


Kidney Margin (OD04348) 


13.1 


Colon cancer (OD06064) 


23.3 


Kidney malignant cancer 
(OD06204B) 


1.0 


Colon Margin (OD06064) 


3.6 


Kidney normal adjacent tissue 
(OD06204E) 


9.5 


Colon cancer (OD06159) 


1.5 


Kidney Cancer (OD04450-01) 


22.2 
Colon Margin (OD06159)


3.6 


Kidney Margin [sample ID illegible]




Colon cancer (OD06297-04) 


1.3 


Kidney Cancer 8120613 


0.6 


Colon Margin (OD06297-05) 


4.7 


Kidney Margin 8120614 


0.0 


CC Gr.2 ascend colon (OD03921) 


1.5 


Kidney Cancer 9010320 


10.7 


CC Margin (OD03921) 


2.6 


Kidney Margin 9010321 


6.6 


Colon cancer metastasis 
(OD06104) 


6.7 


Kidney Cancer 8120607


97 


Lung Margin (OD06104) 


6.0 


Kidney Margin 8120608 


11.4 


Colon mets to lung (OD04451-01) 


12.8 


Normal Uterus 


3.1 


Lung Margin (OD04451-02) 


6.0 


Uterine Cancer 064011 


3.5 


Normal Prostate 


2.3 


Normal Thyroid 


7.2 


Prostate Cancer (OD04410) 


0.7 


Thyroid Cancer 064010 


44.8 


Prostate Margin (OD04410) 


1.2 


Thyroid Cancer A302152 


100.0 


Normal Ovary 


6.1 


Thyroid Margin A302153 


7.6 


Ovarian cancer (OD06283-03) 


4.1 


Normal Breast 


2.2 


Ovarian Margin (OD06283-07) 


2.0 


Breast Cancer (OD04566) 


2.5 


Ovarian Cancer 064008 


9.2 


Breast Cancer 1024 


6.3 


Ovarian cancer (OD06145) 


8.9 


Breast Cancer (OD04590-01) 


8.5 


Ovarian Margin (OD06145) 


3.8 


Breast Canr^r 1\4Wq 
(OD04590-03) 


4.4 


Ovarian cancer (OD06455-03) 


6.1 


Breast Cancer Metastasis 
(OD04655-05) 


3.3 


Ovarian Margin (OD06455-07) 


1.0 


Breast Cancer 064006 


4.9


Normal Lung 


4.9 


Breast Cancer 9100266 


2.7 


Invasive poor diff. lung adeno 
(OD04945-01)


2.9 


Breast Margin 9100265 


1.7 


Lung Margin (OD04945-03)


3.2 


Breast Cancer A209073 


1.5 


Lung Malignant Cancer 
(OD03126) 


in 


Breast Margin A2090734 


2.3 


Lung Margin (OD03126)


5.1 


Breast cancer (OD06083) 


4.4 


Lung Cancer (OD05014A) 


19.6 


Breast cancer node metastasis 
(OD06083) 




Lung Margin (OD05014B) 


15.3 


Normal Liver 


6.9 


Lung cancer (OD06081) 


3.4 


Liver Cancer 1026 


8.0 


Lung Margin (OD06081) 


1.3 


Liver Cancer 1025 


22.2 


Lung Cancer (OD04237-01)


4.6


Liver Cancer 6004-T 


13.8 


Lung Margin (OD04237-02) |l 1.1 3 


Liver Tissue 6004-N 


U 


Ocular Melanoma Metastasis ; 


J.5 ] 


Liver Cancer 6005-T 


11.5 


Ocular Melanoma Margin (Liver) J 


).8 ] 


Liver Tissue 6005-N


51.1 


Melanoma Metastasis i 


5.4 ] 


4ver Cancer 064003 3 


L3.6 


Melanoma Margin (Lung) i 


5.1 I 


formal Bladder 5 


>.8 


Normal Kidney 2 


S3 I 


Bladder Cancer 1023 'A 


k8 


Kidney Ca, Nuclear grade 2 
(OD04338) 


i.O I 


Bladder Cancer A302173 t 


i.l 


Kidney Margin (OD04338) 3 


0.6 1 


formal Stomach 5 


.3 
Kidney Ca, Nuclear grade 1/2
(OD04339) 


15.0 


Gastric Cancer 9060397 * 


7 "4 4 u 7" 


-I 1 


Kidney Margin (OD04339) 


11.3 


Stomach Margin 9060396 


5.0 




Kidney Ca, Clear cell type 
(OD04340) 


4.2 


Gastric Cancer 9060395 


4.6 




Kidney Margin (OD04340) 


7.2 


Stomach Margin 9060394 


7.7 




Kidney Ca, Nuclear grade 3 
(OD04348) 


3.1 


Gastric Cancer 064005 


3.8 





Table ANI. Panel 4.1D 



Tissue Name 


Rel. Exp.(%) Ag5278, Run 230472911


Tissue Name


Rel. Exp.(%) Ag5278, Run 230472911


Secondary Thl act 


3.4 


HUVEC IL-lbeta 


13.0 


Secondary Th2 act 


3.3 


HUVEC IFN gamma 


9.0 


Secondary Trl act 


1.2 


HUVEC TNF alpha + IFN gamma 


7.4 


Secondary Thl rest 


0.4 


HUVEC TNF alpha + IL4


1. 1 


Secondary Th2rest 


0.0 


HUVEC IL-11 


3.6 


Secondary Tr1 rest


0.0


Lung Microvascular EC none 


27.7 


Primary Thl act 



0.0 


Lung Microvascular EC TNFalpha 
+ IL-1beta


8.2 


Primary Th2 act 


i.i 


Microvascular Dermal EC none 


4.2 


Primary Trl act 


1.4 


Microvascular Dermal EC
TNFalpha + IL-1beta


3.0 


Primary Thl rest 


0.5 


Bronchial epithelium TNFalpha + 
IL-1beta


9.1 


Primary Th2 rest 


0.5 


Small airway epithelium none 


22.1 


Primary Trl rest 


0.9 


Small airway epithelium TNFalpha 
+ IL-lbeta 


33.9 


CD45RA CD4 lymphocyte act 


5.0 


Coronery artery SMC rest 


6.2 


CD45RO CD4 lymphocyte act 


1.6 


Coronery artery SMC TNFalpha + 
IL-lbeta 


11.3 


CD8 lymphocyte act 


0.4 


Astrocytes rest 


2.3 


Secondary CD8 lymphocyte rest 


1.3 


Astrocytes TNFalpha + IL-1beta


3.1 


Secondary CD8 lymphocyte act 


0.0 


KU-812 (Basophil) rest 


1.9 


CD4 lymphocyte none 


3.0 


KU-812 (Basophil) 
PMA/ionomycin 


10.9 


2ry Thl/Th2/Trl_anti-CD95 
CH11 


10 


"CD1106 (Keratinocytes) none 


5.8 


LAK cells rest 


8.6. ! 


CCD1106 (Keratinocytes)
TNFalpha + IL-1beta


*.8 


LAK cells IL-2 ( 


).6 I 


-iver cirrhosis 3 


1.9 
LAK cells IL-2+IL-12


0.0 


" NCI-H292 iff T • O S O S 


A tit J -ft "7 


LAK cells IL-2+IFN gamma 


0.7 


NCI-H292 IL-4


8.4


LAK cells IL-2+ IL-18 


0.9 


NCI-H292 IL-9


7.0 


LAK cells PMA/ionomycin




NCI-H292 IL-13


5.6


NK Cells IL-2 rest




NCI-H292 IFN gamma




Two Way MLR 3 day


9.4 


HPAEC none


9.1 


Two Way MLR 5 day


3.9 


HPAEC TNF alpha + IL-1 beta


28.3 


Two Way MLR 7 day


2.3 



Lung fibroblast none


9.3 


PBMC rest 


0.6 


Lung fibroblast TNF alpha + IL-1
beta


12.2 


PBMC PWM


1.1 


Lung fibroblast IL-4


3.9 


PBMC PHA-L 


2.2


Lung fibroblast IL-9


11.8 


Ramos (B cell) none 


0.0


Lung fibroblast IL-13


5.4 


Ramos (B cell) ionomycin 


0.0


Lung fibroblast IFN gamma


19.5 


B lymphocytes PWM 


0.0


Dermal fibroblast CCD1070 rest


32.1 


B lymphocytes CD40L and IL-4


1.4 


Dermal fibroblast CCD 1070 TNF 
alpha 


OO.U 


EOL-1 dbcAMP 


1.4 


Dermal fibroblast CCD1070 IL-1 
beta 


21.8 


EOL-1 dbcAMP 
PMA/ionomycin 


1.4 


Dermal fibroblast IFN gamma 


42.3 


Dendritic cells none 


100.0 


Dermal fibroblast IL-4 


45 J 


Dendritic cells LPS 


34.9 


Dermal Fibroblasts rest 


15.7 


Dendritic cells anti-CD40 (44. 8 


Neutrophils TNFa+LPS 


0.0 


Monocytes rest 


1.4 


Neutrophils rest 


3.6 


Monocytes LPS 


19.9 


Colon ( 


3.0 


Macrophages rest 


12.5 


Lung 


I.4 


Macrophages LPS 


L1.2 


rhymus ( 


).0 


HUVEC none ; 


)-9 1 


<idney j 


[2.8 


HUVEC starved ] 


11.7 


* 





Table ANJ. Panel 4D



Tissue Name


Rel. Exp.(%) Ag2052, Run 161706487


Tissue Name


Rel. Exp.(%) Ag2052, Run 161706487


Secondary Thl act 


2.6 


HUVEC IL-lbeta 


2.1 


Secondary Th2 act 


1.7 


HUVEC IFN gamma 


5.2 


Secondary Trl act 


1.9 


HUVEC TNF alpha + IFN gamma 


5.7 


Secondary Thl rest 


6.3 


HUVEC TNF alpha + IL4 


4.5 


Secondary Th2 rest 


0.5 


HUVEC IL-11 


2.6 


Secondary Trl rest 


0.6 


Lung Microvascular EC none 


9.9 
Primary Thl act


1.4 


Lung Microvascular EC TNFalpha
+ IL-1beta


110.0 


Primary Th2 act 


0.7


Microvascular Dermal EC none 


J16-6 


Primary Trl act . 


1.2 


Microvascular Dermal EC
TNFalpha + IL-lbeta 


o o 
y.JL 


Primary Thl rest 


2.2 


Bronchial epithelium TNFalpha + 
ILlbeta 


3.1 


Primary Th2 rest 


_ L£ 


Small airway epithelium none


12.5


Primary Trl rest 


0.2 


Small airway epithelium TNFalpha 
+ IL-1beta


46.0 


CD45RA CD4 lymphocyte act 


4.2 


jCoronery artery SMC rest 


5.4 "1 


CD45RO CD4 lymphocyte act


1.4 


jCoronery artery SMC TNFalpha + 
|IL-lbeta 


4.3 j 


CD8 lymphocyte act 


0.3 


[Astrocytes rest 


2.2 


Secondary CD8 lymphocyte rest 


1.4 


{Astrocytes TNFalpha + IL-lbeta 


2.0 1 


Secondary CD8 lymphocyte act 


0.4 


KU-812 (Basophil) rest 


1.5 


CD4 lymphocyte none 


0.4 


KU-812 (Basophil) 
jPMA/ionomycin 


11.0 j 


2ry Th1/Th2/Tr1_anti-CD95


0.8 


CCD1106 (Keratinocytes) none


% 1 1 


LAK cells rest 


43.2 


CCD1106 (Keratinocytes) 
TNFalpha + IL-1beta


0.8 


LAK cells IL-2


0.8 


Liver cirrhosis 


1.5 


LAK cells IL-2+IL-12




Lupus kidney 


0.7 ] 


LAK cells IL-2+IFN gamma


3.2 


NCI-H292 none


5.8 1 


LAK cells IL-2+ IL-1 8 


2.1 


i>^i xizyz JLL-4 


5.5 ] 


LAK cells PMA/ionomycin 


26.2 


NCI-H292 IL-9


7.4 | 


NK Cells IL-2 rest


0.3 


NCI-H292 IL-13


2.7 


Two Way MLR 3 day


9.2 


NCI-H292 IFN gamma 


3.3 j 


Two Way MLR 5 day




HPAEC none 1 


5.6 _J 


Two Wav MT R 7 Hav J 


z.u 


HPAEC TNF alpha + EL-1 beta 


10.7 j 


PBMC rest 


1 A 

1-0 


Lung fibroblast none 


5.3 j 


PBMC PWM


5.3 


Lung fibroblast TNF alpha + IL-1 
beta * 


5.3 


PBMC PHA-L 


)-U 


-ung fibroblast EL-4 ] 


.6.4 1 


Ramos (B cell) none ( 


).0 


Lung fibroblast 31-9 \ 


1.1 | 


Ramos (B cell) ionomycin ( 


).0 I 


^ung fibroblast EL-13 5 


& J 


B lymphocytes PWM 1 


1.2 I 


_ung fibroblast IFN gainrria ^ 1 


5a ~1 


B lymphocytes CD40L and IL-4 ] 


.2 I 


)ermal fibroblast CCD1070 rest 1 


5.5 


EOL-1 dbcAMP C 


17 L 
a 


dermal riDroDiast LAJJIU/O TNF 
lpha 1 


8.9 


EOL-1 dbcAMP 
PMA/ionomycin 


£ 


)ermal fibroblast CCD1070 IL-1 
eta * 


1.1 | 


Dendritic cells none 6 


6.9 E 


termal fibroblast IFN gamma l! 


J.6 j 


Dendritic cells LPS 3 


7.6 E 


termal fibroblast DL-4 2 


1.2 1 


Dendritic cells anti-CD40 7 


7.9 U 


3D Colitis 2 q. 


2 ] 
Monocytes rest


5.1 


IBD Crohn's


Ol'i) — «• 


Monocytes LPS 


17.2 


Colon 


3.9 


Macrophages rest 


100.0 


Lung 


19.8 


Macrophages LPS 


40.9 


Thymus 


12.9 


HUVEC none 


5.3 


Kidney 


2.4 


HUVEC starved 


10.6 







Table ANK. Panel 5 Islet 



Tissue Name


Rel. Exp.(%) Run 279370795


Tissue Name


Rel. Exp.(%) Run 279370795


97457_Patient-02go_adipose 


15.6 


94709_Donor 2 AM - A_adipose


24.7 


97476_Patient-07sk_skeletal
muscle 


0.0 


94710_Donor 2 AM - B_adipose


24.7 


97477_Patient-07ut_uterus


22.1 


94711_Donor 2 AM - C_adipose


14.7 


97478_Patient-07pl_placenta 


13.1 


94712_Donor 2 AD - A_adipose 


64.2 


90167 "R a w>r Patient 1 


17 


94713_Donor 2 AD - B_adipose


o9.5 


97482_Patient-08ut_uterus


J. _J. J 


94714_Donor 2 AD - C_adipose




97483_Patient-08pl_placenta


11.6 


94742_Donor 3 U - A_Mesenchymal
Stem Cells


17.3 . 


97486 J?atient-09sk_skeletal 
muscle 


4.8 


94743_Donor 3 U - B_Mesenchymal
Stem Cells


23.2 


97487 J?atient-09ut_uterus 


15.5 


94730 J)onor 3 AM - A_adipose 


54.0 


97488_Patient-09pl_placenta


7 Q 

/ -.7 


94731_Donor 3 AM - B_adipose


/O.J 


97492 JPatient-10ut_uterus 


14.5 


94732 J)onor 3 AM - C_adipose 


59.9 


97493 JPatient-lOpLplacenta 


23.8 


94733 JDonor 3 AD - A.adipose 


100.0 


97495_Patient-l 1 go_adipose 


11.9 


94734 JDonor 3 AD - B_adipose 


92.0 


97496 JPatient-1 lsk_skeletal 
muscle 


3.2 | 


94735 J)onor 3 AD - C_adipose 


32.1 


97497 JPatient-1 lut_uterus 


36.9 


77138_Liver_HepG2untreated


62.9 


97498 J>atient-1 lpLplacenta 


7.0 


73556_Heart_Cardiac stromal cells 
(primary) 


0.3 


97500JPatient-12go_adipose 


17.2 


81735_Small Intestine 


10.9 


97501JPatient-12sk_skeletal 
muscle 


8.4 


72409_Kidney_Proximal Convoluted
Tubule 


23.7 


97502JPatient-12ut_uterus 


25.2 


82685_Small intestine_Duodenum


9.3 


97503JPatienM2pLplacenta 


23.8 


90650_Adrenal_Adrenocortical
adenoma 


8.4 


94721_Donor2U- 
AJMtesenchymal Stem Cells 


61.6 


72410_Kidney_HRCE


tt).l 


94722JDonor2U- 
B_Mesenchymal Stem Cells 


*5.1 


72411_Kidney_HRE


13.5 
94723_Donor 2 U -
C_Mesenchymal Stem Cells



53.2 



muscle cells 



AI_comprehensive panel_v1.0 Summary: Ag2052 Highest expression of this
gene is detected in synovium from an osteoarthritis (OA) patient (CT=20.3). High levels of
expression of this gene are detected in samples derived from normal and osteoarthritis/
rheumatoid arthritis bone and adjacent bone, cartilage, synovium and synovial fluid
samples, from normal lung, COPD lung, emphysema, atopic asthma, asthma, allergy,
Crohn's disease (normal matched control and diseased), ulcerative colitis (normal matched
control and diseased), and psoriasis (normal matched control and diseased). Therefore,
therapeutic modulation of this gene product may ameliorate symptoms/conditions
associated with autoimmune and inflammatory disorders including psoriasis, allergy,
asthma, inflammatory bowel disease, rheumatoid arthritis and osteoarthritis.

CNS_neurodegeneration_v1.0 Summary: Ag5277/Ag5278 Expression of this
gene is low/undetectable (CTs > 35) across all of the samples on this panel.
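
The panel summaries quote raw CT values, while the tables report relative expression in percent. As a minimal sketch of how the two scales could relate, the code below assumes (this is not stated in this section) that the sample with the lowest CT on a panel is set to 100% and that each additional PCR cycle halves the signal. The function names and the example CT values are hypothetical; only the CTs > 35 cutoff comes from the text above.

```python
def relative_expression(cts):
    """Scale CT values to percent of the most highly expressed sample.

    Assumes perfect doubling per PCR cycle, so one extra cycle halves the signal.
    """
    ct_min = min(cts.values())
    return {name: 100.0 * 2.0 ** (ct_min - ct) for name, ct in cts.items()}


def is_low_or_undetectable(ct, cutoff=35.0):
    """Apply the CTs > 35 rule quoted in the panel summaries."""
    return ct > cutoff


# Hypothetical CT values for illustration only (not data from the tables).
example_cts = {"sample_a": 25.0, "sample_b": 26.0, "sample_c": 36.5}
rel = relative_expression(example_cts)
flags = {name: is_low_or_undetectable(ct) for name, ct in example_cts.items()}
print(rel)
print(flags)
```

Under these assumptions a one-cycle difference in CT corresponds to a two-fold difference in relative expression, which is why samples a few cycles apart can differ by an order of magnitude in the tables.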

General_screening_panel_v1.5 Summary: Ag5278 Highest expression of this
gene is detected in breast cancer BT-549 cell line (CT=29). Moderate levels of expression
of this gene are also seen in a cluster of cancer cell lines derived from pancreatic, gastric,
colon, lung, liver, renal, breast, ovarian, melanoma and brain cancers. In addition, moderate
to low levels of expression of this gene are also seen in all the regions of brain and in tissues
with metabolic/endocrine functions such as pancreas, adrenal gland, thyroid, fetal liver and
colon. Please see panel 1.3D for further discussion of this gene.

Ag5277 Expression of this gene is low/undetectable (CTs > 35) across all of the 
samples on this panel. 

HASS Panel v1.0 Summary: Ag2052 Two experiments with the same probe and
primer sets are in excellent agreement. This gene shows widespread expression in this
panel, with highest expression in primary renal proximal tubular epithelial cells cultured in
vitro (CTs=20-22). The expression of this gene is also higher in the glioblastoma type of
brain cancer compared to the medulloblastoma, suggesting that it may play a greater role in
glioblastoma development than in medulloblastomas. Expression is also higher in
U87-MG cells when they are deprived of nutrients and oxygen and exposed to an acidic pH
than in the control population (comparing the control U87-MG F4 with U87-MG F5, F7,
F10). This suggests that the serum-starved, hypoxic and acidotic regions of brain cancers
may express this gene at a higher level and that this may [text illegible in source]
regions.

Panel 1.3D Summary: Ag2052 This gene shows widespread expression in this
panel. Highest expression of this gene is detected in breast cancer BT-549 cell line
(CT=24.9). High levels of expression of this gene are also seen in a cluster of cancer cell lines
derived from pancreatic, gastric, colon, lung, liver, renal, breast, ovarian, prostate,
melanoma and brain cancers. Thus, expression of this gene could be used as a marker to
detect the presence of these cancers. Furthermore, therapeutic modulation of the expression
or function of this gene may be effective in the treatment of pancreatic, gastric, colon, lung,
liver, renal, breast, ovarian, prostate, melanoma and brain cancers.

Among tissues with metabolic or endocrine function, this gene is expressed at high
levels in pancreas, adipose, adrenal gland, thyroid, pituitary gland, skeletal muscle, heart,
liver and the gastrointestinal tract. Therefore, therapeutic modulation of the activity of this
gene may prove useful in the treatment of endocrine/metabolically related diseases, such as
obesity and diabetes.

In addition, this gene is expressed at high levels in all regions of the central nervous
system examined, including amygdala, hippocampus, substantia nigra, thalamus,
cerebellum, cerebral cortex, and spinal cord. Therefore, therapeutic modulation of this gene
product may be useful in the treatment of central nervous system disorders such as
Alzheimer's disease, Parkinson's disease, epilepsy, multiple sclerosis, schizophrenia and
depression.

Panel 2.2 Summary: Ag2052 Highest expression of this gene is detected in
thyroid cancer (CT=23.9). High to moderate levels of expression of this gene are also seen in
normal and cancer samples derived from melanoma, colon, gastric, bladder, liver, breast,
thyroid, uterine, kidney, lung, ovarian and prostate cancers. Interestingly, higher levels of
expression of this gene are associated with kidney and thyroid cancers as compared to
corresponding normal tissue. Therefore, expression of this gene may be used as a diagnostic
marker to detect the presence of these cancers. Furthermore, therapeutic modulation of this
gene may be useful in the treatment of melanoma, colon, gastric, bladder, liver, breast,
thyroid, uterine, kidney, lung, ovarian and prostate cancers.
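
The cancer-versus-normal comparison made in the Panel 2.2 summary can be checked pair by pair. The sketch below takes four matched cancer/margin pairs whose values are legible in Table ANH and computes a simple tumor-to-margin ratio. The ratio itself and the function name are illustrative assumptions rather than the analysis used for the summary, and a margin reading of 0.0 is reported as undefined instead of being divided by.

```python
# (cancer, margin) relative expression (%) pairs copied from Table ANH.
PAIRS = {
    "Thyroid A302152 / A302153": (100.0, 7.6),
    "Kidney 9010320 / 9010321": (10.7, 6.6),
    "Kidney 8120613 / 8120614": (0.6, 0.0),
    "Kidney OD06204B / OD06204E": (1.0, 9.5),
}


def tumor_to_margin_ratio(tumor, margin):
    """Return tumor/margin, or None when the margin value is 0.0."""
    if margin == 0.0:
        return None
    return tumor / margin


ratios = {name: tumor_to_margin_ratio(t, m) for name, (t, m) in PAIRS.items()}
for name, ratio in ratios.items():
    label = "undefined (margin 0.0)" if ratio is None else f"{ratio:.1f}x"
    print(f"{name}: {label}")
```

On these four pairs the thyroid sample shows a clear elevation in the tumor, while the kidney pairs are mixed, so the per-pair view is a useful complement to the panel-level statement.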

Panel 4.1D Summary: Ag5278 Highest levels of expression of this gene are
detected in resting dendritic cells (CT=32). Moderate to low levels of expression of this
gene are also seen in activated dendritic cells, PMA/ionomycin stimulated LAK cells, LPS
activated macrophage, lung microvascular endothelial cells, [text illegible in source] small
airway epithelium, and dermal fibroblasts. Therefore, therapeutic modulation of this gene
or its protein product may alter the functions associated with these cell types and would be
beneficial in the treatment of autoimmune and inflammatory diseases such as asthma,
allergies, inflammatory bowel disease, lupus erythematosus, psoriasis, rheumatoid arthritis,
and osteoarthritis.

Ag5277 Expression of this gene is low/undetectable (CTs > 35) across all of the 
samples on this panel. 

Panel 4D Summary: Ag2052 Highest expression of this gene is detected in resting
macrophage (CT=21). This gene is expressed at high to moderate levels in a wide range of
cell types of significance in the immune response in health and disease. These cells include
members of the T-cell, B-cell, dendritic cell, endothelial cell, macrophage/monocyte, and
peripheral blood mononuclear cell families, as well as epithelial and fibroblast cell types
from lung and skin, and normal tissues represented by colon, lung, thymus and kidney. This
ubiquitous pattern of expression suggests that this gene product may be involved in
homeostatic processes for these and other cell types and tissues. This pattern is in
agreement with the expression profile in General_screening_panel_v1.3 and also suggests a
role for the gene product in cell survival and proliferation. Therefore, modulation of the
gene product with a functional therapeutic may lead to the alteration of functions associated
with these cell types and lead to improvement of the symptoms of patients suffering from
autoimmune and inflammatory diseases such as asthma, allergies, inflammatory bowel
disease, lupus erythematosus, psoriasis, rheumatoid arthritis, and osteoarthritis.

Panel 5 Islet Summary: Ag2052 Highest expression of this gene is detected in a
differentiated adipose tissue (CT=24.4). Moderate to high levels of expression are seen in
placenta, uterus, adipose, skeletal muscle, small intestine, heart and kidney. This gene
shows a ubiquitous expression which correlates to the expression in panel 1.3D. Please see
panel 1.3D for further discussion of this gene.

AO. CG56836-04: Cathepsin B. 

Expression of gene CG56836-04 was assessed using the primer-probe set Ag5264,
described in Table AOA. Results of the RTQ-PCR runs are shown in Tables AOB, AOC
and AOD.
Table AOA. Probe Name Ag5264

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-tcctgctgggtttctggt-3' | 18 | 455 | 411
Probe | TET-5'-ccgtactccatccctccctgtgagc-3'-TAMRA | 25 | 503 | 412
Reverse | 5'-tgtttgtaggtcgggctgta-3' | 20 | 605 | 413
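
The entries of Table AOA can be sanity-checked mechanically. The sketch below confirms that each listed length matches its sequence and estimates the amplicon span from the start positions. It assumes that every start position is the lowest-numbered base covered by the oligo on the target sequence; the table does not state which convention is used, and the function names are hypothetical.

```python
# Sequences, stated lengths and start positions copied from Table AOA.
OLIGOS = {
    "Forward": ("tcctgctgggtttctggt", 18, 455),
    "Probe": ("ccgtactccatccctccctgtgagc", 25, 503),
    "Reverse": ("tgtttgtaggtcgggctgta", 20, 605),
}


def lengths_match(oligos):
    """Check each sequence against the length listed for it in the table."""
    return {name: len(seq) == length for name, (seq, length, _start) in oligos.items()}


def amplicon_length(oligos):
    """Span from the forward start to the end of the reverse primer footprint.

    Assumes start positions mark the lowest-numbered base covered by each oligo.
    """
    _, _, fwd_start = oligos["Forward"]
    _, rev_len, rev_start = oligos["Reverse"]
    return rev_start + rev_len - fwd_start


checks = lengths_match(OLIGOS)
span = amplicon_length(OLIGOS)
print(checks, span)
```

Under the stated assumption the probe footprint lies between the two primers, and the amplicon is short, as is typical for a TaqMan-style assay.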



Table AOB. CNS_neurodegeneration_v1.0

Tissue Name | Rel. Exp.(%) Ag5264, Run 230512807 | Tissue Name | Rel. Exp.(%) Ag5264, Run 230512807
AD 1 Hippo | 10.2 | Control (Path) 3 Temporal Ctx | 3.6
AD 2 Hippo | 32.5 | Control (Path) 4 Temporal Ctx | 18.4
AD 3 Hippo | 9.3 | AD 1 Occipital Ctx | 14.7
AD 4 Hippo | [illegible] | AD 2 Occipital Ctx (Missing) | 0.0
AD 5 Hippo | 94.0 | AD 3 Occipital Ctx | 7.3
AD 6 Hippo | 66.9 | AD 4 Occipital Ctx | 13.4
Control 2 Hippo | 25.0 | AD 5 Occipital Ctx | 15.3
Control 4 Hippo | 13.0 | AD 6 Occipital Ctx | 39.0
Control (Path) 3 Hippo | 4.0 | Control 1 Occipital Ctx | 5.9
AD 1 Temporal Ctx | 9.8 | Control 2 Occipital Ctx | 53.6
AD 2 Temporal Ctx | 25.2 | Control 3 Occipital Ctx | 8.4
AD 3 Temporal Ctx | 3.9 | Control 4 Occipital Ctx | 6.3
AD 4 Temporal Ctx | 7.5 | Control (Path) 1 Occipital Ctx | 83.5
AD 5 Inf Temporal Ctx | 74.7 | Control (Path) 2 Occipital Ctx | 6.0
AD 5 Sup Temporal Ctx | 43.8 | Control (Path) 3 Occipital Ctx | 1.7
AD 6 Inf Temporal Ctx | 71.2 | Control (Path) 4 Occipital Ctx | 13.1
AD 6 Sup Temporal Ctx | 41.8 | Control 1 Parietal Ctx | 2.9
Control 1 Temporal Ctx | 5.9 | Control 2 Parietal Ctx | 30.1
Control 2 Temporal Ctx | 45.1 | Control 3 Parietal Ctx | 12.3
Control 3 Temporal Ctx | 12.0 | Control (Path) 1 Parietal Ctx | 100.0
Control 4 Temporal Ctx | 6.7 | Control (Path) 2 Parietal Ctx | 12.6
Control (Path) 1 Temporal Ctx | 47.3 | Control (Path) 3 Parietal Ctx | 2.5
Control (Path) 2 Temporal Ctx | 15.9 | Control (Path) 4 Parietal Ctx | 44.1
Table AOC. General_screening_panel_v1.5



Tissue Name


Rel. Exp.(%) Ag5264, Run 232936651


Tissue Name


Rel. Exp.(%) Ag5264, Run 232936651


Adipose 


0.7 


Renal ca.TK-10 


3.6 


Melanoma* Hs688(A).T


19.5 


Bladder 


3.8 


Melanoma* Hs688(B).T


9.0 


Gastric ca. (liver met.) NCI-N87 


10.2 


Melanoma* M14 


24.7 


Gastric ca. KATO III


5.5 


Melanoma* LOX IMVI


15.6 


Colon ca. SW-948 


1.2 


Melanoma* SK-MEL-5 


9.7 


Colon ca. SW480 


7.0


Squamous cell carcinoma SCC-4 


3.1 


Colon ca.* (SW480 met) SW620 


2.0 


Testis Pool 


0.4 


Colon ca. HT29 


0.6 


Prostate ca.* (bone met) PC-3 


2.0 


Colon ca. HCT-116


3.1 


Prostate Pool 


0.6 


Colon ca. CaCo-2 


5.2 


Placenta 


3.7 


Colon cancer tissue 


8.6 


Uterus Pool 


0.2 


Colon ca. SW1116


2.4 


Ovarian ca. OVCAR-3 


6.7 


Colon ca. Colo-205 


4.1 


Ovarian ca. SK-OV-3 


7.2 


Colon ca. SW-48 


1.3 


Ovarian ca. OVCAR-4


4.2 


Colon Pool 


1.2 


Ovarian ca. OVCAR-5 


6.2 


Small Intestine Pool 


0.7 


Ovarian ca. IGROV-1 


1.5 


Stomach Pool 


1.3 


Ovarian ca. OVCAR-8 


2.2 


Bone Marrow Pool 


0.7 


Ovary 


1.4


Fetal Heart 


0.5 


Breast ca. MCF-7 


2.7 


Heart Pool 


1.3 


Breast ca. MDA-MB-231 


4.9 


Lymph Node Pool 


2.2 


Breast ca. BT 549 


100.0 


Fetal Skeletal Muscle 


0.3 1 


Breast ca.T47D 


1.3 


Skeletal Muscle Pool 


1.3 


Breast ca. MDA-N 


1.1 


Spleen Pool 


1.2 


Breast Pool 


1.7 


Thymus Pool 


0.9 


Trachea 


3.0 


CNS cancer (glio/astro) U87-MG 


12.6 


Lung 


0.2 


CNS cancer (glio/astro) U-118-MG


9.0 


Fetal Lung


1.6


CNS cancer (neuro;met) SK-N-AS


2.1 


Lung ca. NCI-N417


0.2 


CNS cancer (astro) SF-539 


7.4 


Lung ca.LX-1 


4.5 


CNS cancer (astro) SNB-75 


22.5 


Lung ca. NCI-H146


0.2 


CNS cancer (glio)SNB-19 


1.7 


Lung ca. SHP-77


1.6 


CNS cancer (glio) SF-295 


15.6 


Lung ca. A549 


4.1 


Brain (Amygdala) Pool 


1.4 


Lung ca. NCI-H526


0.2 


Brain (cerebellum) 


5.6 


Lung ca. NCI-H23 


2.2 


Brain (fetal) 


1.0 


Lung ca.NCI-H460 


1.2 


Brain (Hippocampus) Pool 


1.3 


Lung ca. HOP-62 


5.6 


Cerebral Cortex Pool 


1.6 
Lung ca. NCI-H522


1.4


Brain (Substantia nigra) Pool


2 «jK jL„rf? J* 1 


Liver 


1.7 


Brain (Thalamus) Pool 


2.1 


Fetal Liver




Brain (whole) 


j.l 


Liver ca. HepG2




Spinal Cord Pool


1 6 


Kidney Pool 


2.4 


Adrenal Gland 


2.1 


Fetal Kidney 


1.0 


Pituitary gland Pool 


0.4 


Renal ca. 786-0 


1.0 


Salivary Gland 


1.6 


Renal ca. A498 


1.7 


Thyroid (female) 


16.7 


Renal ca. ACHN 


4.0 


Pancreatic ca. CAPAN2 


5.6 


Renal ca.UO-31 


11.2 


Pancreas Pool 


2.8 



Table AOD. Panel 4.1D 



Tissue Name 


Rel. Exp.(%) Ag5264, Run 230472870


Tissue Name


Rel. Exp.(%) Ag5264, Run 230472870


Secondary Thl act 


4.0 


HUVEC IL-lbeta 


9.2 


Secondary Th2 act 


3.3 


HUVEC IFN gamma


I.L 


Secondary Trl act 


1.2 


HUVEC TNF alpha + IFN gamma 


4.6 


Secondary Thl rest 


0.3 


HUVEC TNF alpha + IL4


5.1 


Secondary Th2 rest 


0.2 


HUVEC IL-11 


4.5 


Secondary Trl rest 


0.2 


Lung Microvascular EC none 


32.5 


Primary Thl act 


0.5 


Lung Microvascular EC TNFalpha 
+ IL-lbeta 


10.3 


Primary Th2 act 


0.7 


Microvascular Dermal EC none 


42 


Primary Trl act 


1.0 


Microsvasular Dermal EC 
TNFalpha + IL-lbeta 


2.8 


Primary Thl rest 


0.2 


Bronchial epithelium TNFalpha + 
ILlbeta 


11.5 


Primary Th2 rest 


0.3 


Small airway epithelium none 


15.8 


Primary Trl rest 


0.2 


Small airway epithelium TNFalpha 
+ IL-lbeta 


20.2 


CD45RA CD4 lymphocyte act 


4.6 


Coronery artery SMC rest 


6.0 


CD45RO CD4 lymphocyte act 


1.7 


Coronery artery SMC TNFalpha + 
IL-lbeta 


5.1 


CD8 lymphocyte act 


0.3 


Astrocytes rest 


1.5 


Secondary CD8 lymphocyte rest 


1.1 


Astrocytes TNFalpha + IL-lbeta 


1.9 


Secondary CD8 lymphocyte act 


0.3 


KU-812 (Basophil) rest 


1.7 


CD4 lymphocyte none 


0.1 


KU-812 (Basophil) 
PMA/ionomycin 


8.9 


2ryThl/Th2/Trl anti-CD95 
CH11 


0.8 


CCD 1106 (Keratinocytes) none 


6.8 
LAK cells rest


39.2 


CCD1106 (Keratinocytes)
TNFalpha + IL-1beta




LAK cells IL-2


0.6 


Liver cirrhosis 


3.8 


LAK cells IL-2+IL-12


0.1 


NCI-H292 none 


3.6 


LAK cells IL-2+IFN gamma


0.3 


NCI-H292 IL-4 


4.7 


LAK cells IL-2+ IL-18


0.3 


NCI-H292 IL-9


5.4 


LAK cells PMA/ionomycin 


54.3 


NCI-H292 IL-13 


3.3 


NK Cells IL-2 rest


0.6 


NCI-H292 IFN gamma 


2.4 


Two Way MLR 3 day 


9.0 


HPAEC none 


3.7 


Two Way MLR 5 day 


3.4 


HPAEC TNF alpha + IL-1 beta 


27.0 


Two Way MLR 7 day 


1.3 


Lung fibroblast none 


10.7 


PBMC rest


0.4 


Lung fibroblast TNF alpha + IL-1 
beta 


10.4 


PBMC PWM


0.7 


Lung fibroblast IL-4 


4.5 


PBMC PHA-L


2.7 


Lung fibroblast IL-9 


8.2 


Ramos (B cell) none 


0.0 


Lung fibroblast IL-13 


2.2 


Ramos (B cell) ionomycin


0.0 


Lung fibroblast IFN gamma


AU.vr 


B lymphocytes PWM 


0.5 


Dermal fibroblast CCD1070 rest




B lymphocytes CD40L and IL-4


1.3 


Dermal fibroblast CCD1070 TNF
alpha 


16.6 


EOL-1 dbcAMP 


1.0 


Dermal fibroblast CCD 1070 IL-1 
beta 


16.7 


EOL-1 dbcAMP 
PMA/ionomycin 


0.9 


Dermal fibroblast IFN gamma


JX.O 


Dendritic cells none


100.0


Dermal fibroblast IL-4


20.3 


Dendritic cells LPS 


31.9 


Dermal Fibroblasts rest 


14.6 


Dendritic cells anti-CD40


36.3 


Neutrophils TNFa+LPS 


0.2 


Monocytes rest 


1.4 


Neutrophils rest 


0.2 


Monocytes LPS 


40.9 


Colon 


10 


Macrophages rest 


26.1 


Lung 


1.4 


Macrophages LPS 


16.7 


Thymus


).2 


HUVEC none 


t.7 


Kidney




HUVEC starved 


5.8 







CNS_neurodegeneration_v1.0 Summary: Ag5264 This panel confirms the
expression of this gene at low levels in the brains of an independent group of individuals.
However, no differential expression of this gene was detected between Alzheimer's
diseased postmortem brains and those of non-demented controls in this experiment. Please
see Panel 1.5 for a discussion of the potential utility of this gene in treatment of central
nervous system disorders.

General_screening_panel_v1.5 Summary: Ag5264 Highest expression of this
gene is detected in breast cancer BT-549 cell line (CT=25). Moderate levels of expression
of this gene are also seen in a cluster of cancer cell lines derived from pancreatic, gastric,
colon, lung, liver, renal, breast, ovarian, prostate, melanoma and brain cancers. Thus,
expression of this gene could be used as a marker to detect the presence of these cancers.
Furthermore, therapeutic modulation of the expression or function of this gene may be
effective in the treatment of pancreatic, gastric, colon, lung, liver, renal, breast, ovarian,
prostate, melanoma and brain cancers.

Among tissues with metabolic or endocrine function, this gene is expressed at
moderate levels in pancreas, adipose, adrenal gland, thyroid, pituitary gland, skeletal
muscle, heart, liver and the gastrointestinal tract. Therefore, therapeutic modulation of the
activity of this gene may prove useful in the treatment of endocrine/metabolically related
diseases, such as obesity and diabetes.

In addition, this gene is expressed at moderate levels in all regions of the central
nervous system examined, including amygdala, hippocampus, substantia nigra, thalamus,
cerebellum, cerebral cortex, and spinal cord. Therefore, therapeutic modulation of this gene
product may be useful in the treatment of central nervous system disorders such as
Alzheimer's disease, Parkinson's disease, epilepsy, multiple sclerosis, schizophrenia and
depression.

Panel 4.1D Summary: Ag5264 Highest levels of expression of this gene are
detected in resting dendritic cells (CT=28.7). Moderate to low levels of expression of this
gene are also seen in activated dendritic cells, resting and PMA/ionomycin stimulated LAK
cells, monocytes, macrophages, different types of endothelial cells, small airway epithelium,
lung and dermal fibroblasts and normal tissues represented by lung and kidney. This gene is
upregulated in LPS treated monocytes, cytokine treated HPAEC, and activated secondary
Th1 and Th2 cells. Therefore, therapeutic modulation of this gene or its protein product may
alter the functions associated with these cell types and would be beneficial in the treatment
of autoimmune and inflammatory diseases such as asthma, allergies, inflammatory bowel
disease, lupus erythematosus, psoriasis, rheumatoid arthritis, and osteoarthritis.

AP. CG57284-03: RAS-RELATED PROTEIN RAB-5C. 

Expression of gene CG57284-03 was assessed using the primer-probe set Ag6892,
described in Table APA. Results of the RTQ-PCR runs are shown in Tables APB and APC.
Please note that this sequence represents a full-length physical clone.
Table APA. Probe Name Ag6892

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-gtgtcatccaggcagacagtct-3' | 22 | 473 | 414
Probe | TET-5'-ccgctccaattgtgctctcctggtact-3'-TAMRA | 27 | 507 | 415
Reverse | 5'-cgctttgtcaagggacagttt-3' | 21 | 538 | 416
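
The primer-probe listing in Table APA can be sanity-checked mechanically. The following is a minimal sketch, not part of the original disclosure: it assumes the three Ag6892 sequences and stated lengths as transcribed from Table APA, and the helper name `check_lengths` is hypothetical. It only confirms that each oligonucleotide has the number of bases given in the Length column.

```python
# Primer-probe set Ag6892; sequences and stated lengths are copied from Table APA.
# The dictionary layout and the function name are illustrative assumptions.
PRIMERS = {
    "Forward": ("gtgtcatccaggcagacagtct", 22),
    "Probe": ("ccgctccaattgtgctctcctggtact", 27),
    "Reverse": ("cgctttgtcaagggacagttt", 21),
}


def check_lengths(primers):
    """Map each oligo name to True when len(sequence) equals the stated length."""
    return {name: len(seq) == stated for name, (seq, stated) in primers.items()}


if __name__ == "__main__":
    for name, ok in check_lengths(PRIMERS).items():
        print(name, "length matches" if ok else "length mismatch")
```

The same check could be applied to any other primer-probe table in this section by swapping in its sequences and stated lengths.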



Table APB. General_screening_panel_v1.6

Tissue Name | Rel. Exp.(%) Ag6892, Run 278388295 | Tissue Name | Rel. Exp.(%) Ag6892, Run 278388295
Adipose | 11.0 | Renal ca. TK-10 | 41.5
Melanoma* Hs688(A).T | 37.4 | Bladder | 19.1
Melanoma* Hs688(B).T | 33.0 | Gastric ca. (liver met.) NCI-N87 | 26.4
Melanoma* M14 | 85.3 | Gastric ca. KATO III | 93.3
Melanoma* LOXIMVI | 48.6 | Colon ca. SW-948 | 15.7
Melanoma* SK-MEL-5 | 49.7 | Colon ca. SW480 | 62.4
Squamous cell carcinoma SCC-4 | 28.5 | Colon ca.* (SW480 met) SW620 | 9.5
Testis Pool | 10.1 | Colon ca. HT29 | 20.7
Prostate ca.* (bone met) PC-3 | 0.0 | Colon ca. HCT-116 | 48.0
Prostate Pool | 10.6 | Colon ca. CaCo-2 | 49.7
Placenta | 22.4 | Colon cancer tissue | 19.3
Uterus Pool | 4.8 | Colon ca. SW1116 | [illegible]
Ovarian ca. OVCAR-3 | 18.9 | Colon ca. Colo-205 | 13.3
Ovarian ca. SK-OV-3 | 63.3 | Colon ca. SW-48 | 16.5
Ovarian ca. OVCAR-4 | 17.4 | Colon Pool | 15.5
Ovarian ca. OVCAR-5 | 41.5 | Small Intestine Pool | 8.7
Ovarian ca. IGROV-1 | 18.4 | Stomach Pool | 8.0
Ovarian ca. OVCAR-8 | 13.8 | Bone Marrow Pool | 8.5
Ovary | 10.6 | Fetal Heart | 5.9
Breast ca. MCF-7 | 33.2 | Heart Pool | 6.3
Breast ca. MDA-MB-231 | 46.0 | Lymph Node Pool | 16.4
Breast ca. BT 549 | 37.4 | Fetal Skeletal Muscle | 5.4
Breast ca. T47D | 35.1 | Skeletal Muscle Pool | 1.6
Breast ca. MDA-N | 22.2 | Spleen Pool | 8.8
Breast Pool | 12.7 | Thymus Pool | 8.7
Trachea | 12.0 | CNS cancer (glio/astro) U87-MG | 35.4
Lung | 2.5 | CNS cancer (glio/astro) U-118-MG | 55.9
Fetal Lung | [illegible] | CNS cancer (neuro;met) SK-N-AS | [illegible]
Lung ca. NCI-N417 | 5.4 | CNS cancer (astro) SF-539 | [illegible]
Lung ca. LX-1 | 20.2 | CNS cancer (astro) SNB-75 | 52.9
Lung ca. NCI-H146 | 8.6 | CNS cancer (glio) SNB-19 | 21.2
Lung ca. SHP-77 | 20.2 | CNS cancer (glio) SF-295 | 100.0
Lung ca. A549 | 51.1 | Brain (Amygdala) Pool | 10.6
Lung ca. NCI-H526 | 5.6 | Brain (cerebellum) | 49.0
Lung ca. NCI-H23 | [illegible] | Brain (fetal) | [illegible]
Lung ca. NCI-H460 | [illegible] | Brain (Hippocampus) Pool | [illegible]
Lung ca. HOP-62 | [illegible] | Cerebral Cortex Pool | 17.3
Lung ca. NCI-H522 | [illegible] | Brain (Substantia nigra) Pool | [illegible]
Liver | 5.7 | Brain (Thalamus) Pool | [illegible]
Fetal Liver | 10.8 | Brain (whole) | 23.0
Liver ca. HepG2 | 10.3 | Spinal Cord Pool | 12.5
Kidney Pool | 15.9 | Adrenal Gland | 24.8
Fetal Kidney | 14.0 | Pituitary gland Pool | 2.7
Renal ca. 786-0 | 24.3 | Salivary Gland | 11.3
Renal ca. A498 | 21.9 | Thyroid (female) | 9.8
Renal ca. ACHN | 22.2 | Pancreatic ca. CAPAN2 | 24.8
Renal ca. UO-31 | 35.4 | Pancreas Pool | 8.1



Table APC. Panel 5 Islet

Tissue Name | Rel. Exp.(%) Ag6892, Run 305424859 | Tissue Name | Rel. Exp.(%) Ag6892, Run 305424859
97457_Patient-02go_adipose | 4.5 | 94709_Donor 2 AM - A_adipose | 44.1
97476_Patient-07sk_skeletal muscle | 0.0 | 94710_Donor 2 AM - B_adipose | 30.8
97477_Patient-07ut_uterus | 8.2 | 94711_Donor 2 AM - C_adipose | 21.0
97478_Patient-07pl_placenta | 13.1 | 94712_Donor 2 AD - A_adipose | 48.0
99167_Bayer Patient 1 | 23.2 | 94713_Donor 2 AD - B_adipose | 54.0
97482_Patient-08ut_uterus | 7.7 | 94714_Donor 2 AD - C_adipose | 50.3
97483_Patient-08pl_placenta | 18.9 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 14.7
97486_Patient-09sk_skeletal muscle | 4.4 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 10.4
97487_Patient-09ut_uterus | 19.6 | 94730_Donor 3 AM - A_adipose | 53.2
97488_Patient-09pl_placenta | 11.3 | 94731_Donor 3 AM - B_adipose | 74.2
97492_Patient-10ut_uterus | 12.2 | 94732_Donor 3 AM - C_adipose | 58.6
97493_Patient-10pl_placenta | 34.9 | 94733_Donor 3 AD - A_adipose | 64.6
97495_Patient-11go_adipose | 9.2 | 94734_Donor 3 AD - B_adipose | 100.0
97496_Patient-11sk_skeletal muscle | [illegible] | 94735_Donor 3 AD - C_adipose | 20.4
97497_Patient-11ut_uterus | 25.0 | 77138_Liver_HepG2untreated | 71.2
97498_Patient-11pl_placenta | 8.8 | 73556_Heart_Cardiac stromal cells (primary) | 15.6
97500_Patient-12go_adipose | 10.4 | 81735_Small Intestine | 12.4
97501_Patient-12sk_skeletal muscle | 12.7 | 72409_Kidney_Proximal Convoluted Tubule | 81.2
97502_Patient-12ut_uterus | 18.9 | 82685_Small intestine_Duodenum | 8.1
97503_Patient-12pl_placenta | 17.8 | 90650_Adrenal_Adrenocortical adenoma | 4.8
94721_Donor 2 U - A_Mesenchymal Stem Cells | 27.9 | 72410_Kidney_HRCE | 37.9
94722_Donor 2 U - B_Mesenchymal Stem Cells | 25.7 | 72411_Kidney_HRE | 18.8
94723_Donor 2 U - C_Mesenchymal Stem Cells | 30.4 | 73139_Uterus_Uterine smooth muscle cells | 48.0



General_screening_panel_v1.6 Summary: Ag6892 Highest expression of this
gene is seen in a brain cancer cell line (CT=24.1). This gene is ubiquitously expressed in
this panel, with high levels of expression seen in brain, colon, gastric, lung, breast, ovarian,
and melanoma cancer cell lines. This expression profile suggests a role for this gene
product in cell survival and proliferation. Modulation of this gene product may be useful in
the treatment of cancer.

Among tissues with metabolic function, this gene is expressed at high levels in 
pituitary, adipose, adrenal gland, pancreas, thyroid, and adult and fetal skeletal muscle, 
heart, and liver. This widespread expression among these tissues suggests that this gene 
product may play a role in normal neuroendocrine and metabolic function and that 
dysregulated expression of this gene may contribute to neuroendocrine disorders or
metabolic diseases, such as obesity and diabetes. 

This gene is also expressed at high levels in the CNS, including the hippocampus, 
thalamus, substantia nigra, amygdala, cerebellum and cerebral cortex. Therefore, 
therapeutic modulation of the expression or function of this gene may be useful in the 
treatment of neurologic disorders, such as Alzheimer's disease, Parkinson's disease, 
schizophrenia, multiple sclerosis, stroke and epilepsy. 

In addition, this gene is expressed at much higher levels in fetal lung tissue
(CT=25.7) when compared to expression in the adult counterpart (CT=29.4). Thus,
expression of this gene may be used to differentiate between the fetal and adult source of
this tissue.
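
The size of the fetal versus adult lung difference can be estimated from the two CT values quoted above. The sketch below is illustrative only: it assumes ideal amplification (product doubling every cycle, so `efficiency = 2.0`), which the source does not state, and the function name `fold_difference` is hypothetical. Under that assumption a CT gap of 3.7 cycles corresponds to roughly a 13-fold difference in starting template.

```python
def fold_difference(ct_lower_expr: float, ct_higher_expr: float,
                    efficiency: float = 2.0) -> float:
    """Approximate fold difference in template between two samples.

    A lower CT means more starting template. Each PCR cycle is assumed to
    multiply the product by `efficiency` (2.0 = perfect doubling, an assumption).
    """
    return efficiency ** (ct_lower_expr - ct_higher_expr)


# CT values quoted in the summary: adult lung 29.4, fetal lung 25.7.
fold = fold_difference(29.4, 25.7)
print(f"fetal vs adult lung: about {fold:.1f}-fold")
```

Real assays rarely reach perfect doubling, so a measured efficiency below 2.0 would give a smaller estimate.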

Panel 5 Islet Summary: Ag6892 Highest expression is seen in adipose (CT=26),
with nearly ubiquitous expression seen across the samples on this panel. High to moderate
levels of expression are seen in metabolic tissues, including skeletal muscle, adipose, and
placenta, in agreement with Panel 1.6. Please see that panel for discussion of this gene in
metabolic disease.

AQ. CG57308-02: Sulfonylurea Receptor 1 Splice Variant. 

Expression of gene CG57308-02 was assessed using the primer-probe set Ag7558,
described in Table AQA. Results of the RTQ-PCR runs are shown in Tables AQB and
AQC.

Table AQA. Probe Name Ag7558

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-tcgaagggcacatcatca-3' | 18 | 4319 | 417
Probe | TET-5'-tgcctctgtccctggctgaaattctc-3'-TAMRA | 26 | 4348 | 418
Reverse | 5'-tgaagatgctggtcttcctca-3' | 21 | 4400 | 419



Table AQB. CNS_neurodegeneration_v1.0

Tissue Name | Rel. Exp.(%) Ag7558, Run 308750599 | Tissue Name | Rel. Exp.(%) Ag7558, Run 308750599
AD 1 Hippo | 4.2 | Control (Path) 3 Temporal Ctx | 3.3
AD 2 Hippo | 16.4 | Control (Path) 4 Temporal Ctx | 50.3
AD 3 Hippo | 1.7 | AD 1 Occipital Ctx | 11.1
AD 4 Hippo | 11.3 | AD 2 Occipital Ctx (Missing) | 0.0
AD 5 Hippo | 76.3 | AD 3 Occipital Ctx | 2.3
AD 6 Hippo | 38.7 | AD 4 Occipital Ctx | 19.8
Control 2 Hippo | 17.8 | AD 5 Occipital Ctx | 45.4
Control 4 Hippo | 3.9 | AD 6 Occipital Ctx | 21.2
Control (Path) 3 Hippo | 1.0 | Control 1 Occipital Ctx | 0.9
AD 1 Temporal Ctx | 7.6 | Control 2 Occipital Ctx | 82.4
AD 2 Temporal Ctx | [illegible] | Control 3 Occipital Ctx | [illegible]
AD 3 Temporal Ctx | [illegible] | Control 4 Occipital Ctx | 0.0
AD 4 Temporal Ctx | [illegible] | Control (Path) 1 Occipital Ctx | 100.0
AD 5 Inf Temporal Ctx | [illegible] | Control (Path) 2 Occipital Ctx | 17.1
AD 5 Sup Temporal Ctx | [illegible] | Control (Path) 3 Occipital Ctx | 0.0
AD 6 Inf Temporal Ctx | [illegible] | Control (Path) 4 Occipital Ctx | 31.9
AD 6 Sup Temporal Ctx | 71.7 | Control 1 Parietal Ctx | [illegible]
Control 1 Temporal Ctx | 4.3 | Control 2 Parietal Ctx | 36.9
Control 2 Temporal Ctx | 33.2 | Control 3 Parietal Ctx | 21.5
Control 3 Temporal Ctx | 13.8 | Control (Path) 1 Parietal Ctx | 87.1
Control 4 Temporal Ctx | 2.5 | Control (Path) 2 Parietal Ctx | 41.5
Control (Path) 1 Temporal Ctx | 55.9 | Control (Path) 3 Parietal Ctx | 3.7
Control (Path) 2 Temporal Ctx | 65.1 | Control (Path) 4 Parietal Ctx | 79.0



Table AQC. Panel 5 Islet

Tissue Name | Rel. Exp.(%) Ag7558, Run 312000203 | Tissue Name | Rel. Exp.(%) Ag7558, Run 312000203
97457_Patient-02go_adipose | 0.0 | 94709_Donor 2 AM - A_adipose | 0.0
97476_Patient-07sk_skeletal muscle | 0.0 | 94710_Donor 2 AM - B_adipose | 0.0
97477_Patient-07ut_uterus | 0.0 | 94711_Donor 2 AM - C_adipose | 0.0
97478_Patient-07pl_placenta | 0.0 | 94712_Donor 2 AD - A_adipose | 0.0
99167_Bayer Patient 1 | 100.0 | 94713_Donor 2 AD - B_adipose | 0.0
97482_Patient-08ut_uterus | 0.0 | 94714_Donor 2 AD - C_adipose | 0.0
97483_Patient-08pl_placenta | 0.0 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 0.0
97486_Patient-09sk_skeletal muscle | 0.0 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 0.0
97487_Patient-09ut_uterus | 0.0 | 94730_Donor 3 AM - A_adipose | 0.0
97488_Patient-09pl_placenta | 0.0 | 94731_Donor 3 AM - B_adipose | 0.0
97492_Patient-10ut_uterus | 0.0 | 94732_Donor 3 AM - C_adipose | 0.0
97493_Patient-10pl_placenta | 0.0 | 94733_Donor 3 AD - A_adipose | 0.0
97495_Patient-11go_adipose | 0.0 | 94734_Donor 3 AD - B_adipose | 0.0
97496_Patient-11sk_skeletal muscle | 0.0 | 94735_Donor 3 AD - C_adipose | 0.0
97497_Patient-11ut_uterus | 0.0 | 77138_Liver_HepG2untreated | 0.0
97498_Patient-11pl_placenta | 0.0 | 73556_Heart_Cardiac stromal cells (primary) | 0.0
97500_Patient-12go_adipose | 0.0 | 81735_Small Intestine | 0.0
97501_Patient-12sk_skeletal muscle | 0.0 | 72409_Kidney_Proximal Convoluted Tubule | 0.0
97502_Patient-12ut_uterus | 0.0 | 82685_Small intestine_Duodenum | 0.0
97503_Patient-12pl_placenta | 0.0 | 90650_Adrenal_Adrenocortical adenoma | 0.0
94721_Donor 2 U - A_Mesenchymal Stem Cells | 0.0 | 72410_Kidney_HRCE | 0.0
94722_Donor 2 U - B_Mesenchymal Stem Cells | 0.0 | 72411_Kidney_HRE | 0.0
94723_Donor 2 U - C_Mesenchymal Stem Cells | 0.0 | 73139_Uterus_Uterine smooth muscle cells | 0.0



CNS_neurodegeneration_v1.0 Summary: Ag7558 Highest expression of this
gene is seen in the occipital cortex of a control patient (CT=33). This panel does not show 
differential expression of this gene in Alzheimer's disease. However, this profile does show 
the expression of this gene at low levels in the brain. Therefore, therapeutic modulation of 
the expression or function of this gene may be useful in the treatment of neurological 
disorders, such as Alzheimer's disease, Parkinson's disease, schizophrenia, multiple 
sclerosis, stroke and epilepsy. 

Panel 4.1D Summary: Ag7558 Expression of this gene is low/undetectable in all 
samples on this panel (CTs>35). 

Panel 5 Islet Summary: Ag7558 Expression of this gene is limited to pancreatic 
islet cells (CT=34.6). This gene codes for a variant of SUR1. SUR1 is a subunit of the 
pancreatic beta cell K+ channel that regulates insulin release in glucose-stimulated cells. 
Thus, therapeutic modulation of the SUR1 variant encoded by this gene may be used as a
treatment for the enhancement of insulin secretion in Type 2 diabetes. 

AR. CG93659-03: MITOGEN-ACTIVATED PROTEIN
KINASE KINASE KINASE 9.

Expression of gene CG93659-03 was assessed using the primer-probe set Ag4828, 
described in Table ARA. Results of the RTQ-PCR runs are shown in Tables ARB and 
ARC. 

Table ARA. Probe Name Ag4828

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-gaggaatctgagatgctcaaga-3' | 22 | [illegible] | [illegible]
Probe | TET-5'-caacgctctctctacatcgacctcgg-3'-TAMRA | 26 | 1299 | 421
Reverse | 5'-tccccgaacaagattgaagt-3' | 20 | 1339 | 422



Table ARB. General_screening_panel_v1.4

Tissue Name | Rel. Exp.(%) Ag4828, Run 217081802 | Tissue Name | Rel. Exp.(%) Ag4828, Run 217081802
Adipose | 53.6 | Renal ca. TK-10 | 10.6
Melanoma* Hs688(A).T | 15.5 | Bladder | 31.9
Melanoma* Hs688(B).T | 17.4 | Gastric ca. (liver met.) NCI-N87 | 36.3
Melanoma* M14 | 3.5 | Gastric ca. KATO III | 12.2
Melanoma* LOXIMVI | 3.2 | Colon ca. SW-948 | 5.4
Melanoma* SK-MEL-5 | 0.9 | Colon ca. SW480 | 25.0
Squamous cell carcinoma SCC-4 | 7.0 | Colon ca.* (SW480 met) SW620 | 25
Testis Pool | 4.7 | Colon ca. HT29 | 14.3
Prostate ca.* (bone met) PC-3 | 6.3 | Colon ca. HCT-116 | 2.1
Prostate Pool | 3.9 | Colon ca. CaCo-2 | 15.9
Placenta | 39.0 | Colon cancer tissue | 39.8
Uterus Pool | 9.0 | Colon ca. SW1116 | 3.4
Ovarian ca. OVCAR-3 | 15.7 | Colon ca. Colo-205 | 8.8
Ovarian ca. SK-OV-3 | 46.3 | Colon ca. SW-48 | 5.4
Ovarian ca. OVCAR-4 | 7.1 | Colon Pool | 16.2
Ovarian ca. OVCAR-5 | 30.6 | Small Intestine Pool | 9.3
Ovarian ca. IGROV-1 | 14.1 | Stomach Pool | 17.3
Ovarian ca. OVCAR-8 | 2.7 | Bone Marrow Pool | 7.0
Ovary | 4.5 | Fetal Heart | 2.9
Breast ca. MCF-7 | 100.0 | Heart Pool | 7.9
Breast ca. MDA-MB-231 | 9.2 | Lymph Node Pool | 15.2
Breast ca. BT 549 | 73.2 | Fetal Skeletal Muscle | 1.7
Breast ca. T47D | 66.0 | Skeletal Muscle Pool | 9.8
Breast ca. MDA-N | 0.9 | Spleen Pool | 45.7
Breast Pool | 24.1 | Thymus Pool | 15.9
Trachea | 18.0 | CNS cancer (glio/astro) U87-MG | 7.6
Lung | 6.7 | CNS cancer (glio/astro) U-118-MG | 7.9
Fetal Lung | 68.3 | CNS cancer (neuro;met) SK-N-AS | 2.6
Lung ca. NCI-N417 | [illegible] | CNS cancer (astro) SF-539 | 2.3
Lung ca. LX-1 | 11.8 | CNS cancer (astro) SNB-75 | 14.1
Lung ca. NCI-H146 | 3.0 | CNS cancer (glio) SNB-19 | [illegible]
Lung ca. SHP-77 | [illegible] | CNS cancer (glio) SF-295 | [illegible]
Lung ca. A549 | [illegible] | Brain (Amygdala) Pool | 1.7
Lung ca. NCI-H526 | [illegible] | Brain (cerebellum) | [illegible]
Lung ca. NCI-H23 | 13.4 | Brain (fetal) | [illegible]
Lung ca. NCI-H460 | [illegible] | Brain (Hippocampus) Pool | [illegible]
Lung ca. HOP-62 | [illegible] | Cerebral Cortex Pool | [illegible]
Lung ca. NCI-H522 | 1.1 | Brain (Substantia nigra) Pool | [illegible]
Liver | [illegible] | Brain (Thalamus) Pool | [illegible]
Fetal Liver | [illegible] | Brain (whole) | [illegible]
Liver ca. HepG2 | [illegible] | Spinal Cord Pool | [illegible]
Kidney Pool | 31.4 | Adrenal Gland | 9.5
Fetal Kidney | 7.7 | Pituitary gland Pool | 1.4
Renal ca. 786-0 | 10.9 | Salivary Gland | 2.5
Renal ca. A498 | 5.2 | Thyroid (female) | 7.7
Renal ca. ACHN | 2.5 | Pancreatic ca. CAPAN2 | 34.4
Renal ca. UO-31 | 14.9 | Pancreas Pool | 19.6



Table ARC. Panel 5D

Tissue Name | Rel. Exp.(%) Ag4828, Run 219436967 | Tissue Name | Rel. Exp.(%) Ag4828, Run 219436967
97457_Patient-02go_adipose | 33.9 | 94709_Donor 2 AM - A_adipose | 10.8
97476_Patient-07sk_skeletal muscle | 33.4 | 94710_Donor 2 AM - B_adipose | 9.3
97477_Patient-07ut_uterus | 59.5 | 94711_Donor 2 AM - C_adipose | 3.0
97478_Patient-07pl_placenta | 39.8 | 94712_Donor 2 AD - A_adipose | 13.7
97481_Patient-08sk_skeletal muscle | 25.9 | 94713_Donor 2 AD - B_adipose | 10.0
97482_Patient-08ut_uterus | 19.8 | 94714_Donor 2 AD - C_adipose | 6.7
97483_Patient-08pl_placenta | 41.5 | 94742_Donor 3 U - A_Mesenchymal Stem Cells | 4.7
97486_Patient-09sk_skeletal muscle | 6.5 | 94743_Donor 3 U - B_Mesenchymal Stem Cells | 2.8
97487_Patient-09ut_uterus | 8.1 | 94730_Donor 3 AM - A_adipose | 6.3
97488_Patient-09pl_placenta | 38.4 | 94731_Donor 3 AM - B_adipose | 2.4
97492_Patient-10ut_uterus | 30.6 | 94732_Donor 3 AM - C_adipose | 2.2
97493_Patient-10pl_placenta | 12.7 | 94733_Donor 3 AD - A_adipose | 10.2
97495_Patient-11go_adipose | 100.0 | 94734_Donor 3 AD - B_adipose | 5.5
97496_Patient-11sk_skeletal muscle | 5.8 | 94735_Donor 3 AD - C_adipose | 4.7
97497_Patient-11ut_uterus | 20.6 | 77138_Liver_HepG2untreated | [illegible]
97498_Patient-11pl_placenta | [illegible] | 73556_Heart_Cardiac stromal cells (primary) | 1.9
97500_Patient-12go_adipose | 82.4 | 81735_Small Intestine | 17.2
97501_Patient-12sk_skeletal muscle | 19.2 | 72409_Kidney_Proximal Convoluted Tubule | 0.9
97502_Patient-12ut_uterus | 23.7 | 82685_Small intestine_Duodenum | 19.1
97503_Patient-12pl_placenta | [illegible] | 90650_Adrenal_Adrenocortical adenoma | 8.8
94721_Donor 2 U - A_Mesenchymal Stem Cells | 1.6 | 72410_Kidney_HRCE | 7.6
94722_Donor 2 U - B_Mesenchymal Stem Cells | 3.0 | 72411_Kidney_HRE | 13.5
94723_Donor 2 U - C_Mesenchymal Stem Cells | 2.1 | 73139_Uterus_Uterine smooth muscle cells | 2.0



General_screening_panel_v1.4 Summary: Ag4828 Highest expression of this
gene is detected in a breast cancer MCF-7 cell line (CT=27.6). Interestingly, this gene is
expressed at much higher levels in fetal (CT=28) when compared to adult lung (CT=31).
This observation suggests that expression of this gene can be used to distinguish fetal from
adult lung. In addition, the relative overexpression of this gene in fetal lung suggests that
the protein product may enhance lung growth or development in the fetus and thus may
also act in a regenerative capacity in the adult. Therefore, therapeutic modulation of the
protein encoded by this gene could be useful in treatment of lung related diseases.
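
The relationship between the CT values quoted above and a "Rel. Exp.(%)" column can be illustrated with a short calculation. This is a sketch under stated assumptions, not the procedure confirmed by this section: it assumes the sample with the lowest CT is set to 100% and that every additional cycle halves the relative amount. The function name `relative_expression` is hypothetical, and because the CT values in the summary are rounded, the percentages it produces only approximate the tabulated ones.

```python
def relative_expression(cts):
    """Scale CT values to percentages, with the lowest-CT sample at 100%.

    Assumes perfect doubling per cycle, so a sample that crosses threshold
    n cycles later than the best sample is reported as 100 / 2**n percent.
    """
    ct_min = min(cts.values())
    return {name: 100.0 * 2.0 ** (ct_min - ct) for name, ct in cts.items()}


# Rounded CT values quoted in the summary for primer-probe set Ag4828.
cts = {"Breast ca. MCF-7": 27.6, "Fetal Lung": 28.0, "Lung": 31.0}
rel = relative_expression(cts)

for name, pct in rel.items():
    print(f"{name}: {pct:.1f}%")
```

With these rounded inputs the fetal lung sample comes out several-fold higher than adult lung, which is the same direction and rough magnitude as the difference described in the summary.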

In addition, significant expression of this gene is found in a number of cancer
(pancreatic, CNS, colon, lung, breast, ovary, prostate, melanoma) cell lines. Therefore, 
therapeutic modulation of the activity of this gene or its protein product, through the use of 
small molecule drugs, might be beneficial in the treatment of these cancers. 

Among tissues with metabolic or endocrine function, this gene is expressed at high 
to moderate levels in pancreas, adipose, adrenal gland, thyroid, skeletal muscle, heart, fetal 
liver and the gastrointestinal tract. Therefore, therapeutic modulation of the activity of this 
gene may prove useful in the treatment of endocrine/metabolically related diseases, such as 
obesity and diabetes. 

This gene encodes a protein that is homologous to mitogen-activated protein kinase
kinase kinase 8 (MAP3K8) (COT proto-oncogene serine/threonine-protein kinase) (C-COT)
(Cancer Osaka thyroid oncogene). COT is able to enhance TNF alpha production and to
activate NF-kB. Both events are connected with insulin resistance and type II diabetes (1,
2, 3). Inhibition of COT kinase would prevent overproduction of TNF alpha and activation
of NF-kB, thus improving insulin resistance and diabetes.

In addition, this gene is expressed at high levels in all regions of the central nervous 
system examined, including amygdala, hippocampus, substantia nigra, thalamus, 
cerebellum, cerebral cortex, and spinal cord. Recently, MKK6, a related protein, has been
shown to be associated with Alzheimer's disease (4). Therefore, based on the homology of this
protein to MKK6 and the presence of this gene in the brain, we predict that this putative 
MAP3K8 may play a role in central nervous system disorders such as Alzheimer's disease, 
Parkinson's disease, epilepsy, multiple sclerosis, schizophrenia and depression. 

References: 

1. Ballester A, Velasco A, Tobena R, Alemany S. Cot kinase activates tumor
necrosis factor-alpha gene expression in a cyclosporin A-resistant manner. J. Biol. Chem.
1998. 273, 14099-106. PMID: 9603908.

2. Bierhaus A, Schiekofer S, Schwaninger M, Andrassy M, Humpert PM, Chen J,
Hong M, Luther T, Henle T, Kloting I, Morcos M, Hofmann M, Tritschler H, Weigle B,
Kasper M, Smith M, Perry G, Schmidt AM, Stern DM, Haring HU, Schleicher E, Nawroth
PP. Diabetes-associated sustained activation of the transcription factor nuclear
factor-kappaB. Diabetes. 2001 50, 2792-808. PMID: 11723063.

3. Belich MP, Salmeron A, Johnston LH, Ley SC. TPL-2 kinase regulates the
proteolysis of the NF-kappaB-inhibitory protein NF-kappaB1 p105. Nature. 1999 397,
363-8. PMID: 9950430.

4. Zhu X, Rottkamp CA, Hartzler A, Sun Z, Takeda A, Boux H, Shimohama S,
Perry G, Smith MA. (2001) Activation of MKK6, an upstream activator of p38, in
Alzheimer's disease. J Neurochem 79(2):311-8

Panel 5D Summary: Ag4828 Highest expression of this gene is detected in
adipose tissue (CT=29). Low to moderate expression of this gene is seen in a wide range of
samples used in this panel including adipose, skeletal muscle, uterus, and placenta. This
widespread expression of this gene in tissues with metabolic or endocrine function
suggests that this gene plays a role in endocrine/metabolically related diseases, such as
obesity and diabetes.

This gene encodes a MAP3K8-like protein. Recently, activation of MAP kinase,
ERK, a related protein, by modified LDL in vascular smooth muscle cells has been
implicated in the development of atherosclerosis in diabetes (1). This
putative MAP3K8 may also play a role in the development of this disease. Therefore,
therapeutic modulation of the activity of this gene or its protein product, through the use of
small molecule drugs, might be beneficial in the treatment of atherosclerosis and diabetes.

References:

1. Velarde V, Jenkins AJ, Christopher J, Lyons TJ, Jaffa AA. (2001) Activation of
MAPK by modified low-density lipoproteins in vascular smooth muscle cells. J Appl
Physiol 91(3):1412-20

AS. CG94521-02 and CG94521-03: CYTOPLASMIC 
GLYCEROL-3-PHOSPHATE DEHYDROGENASE [NAD+]. 

Expression of gene CG94521-02 and CG94521-03 was assessed using the 
primer-probe set Ag3924, described in Table ASA. Results of the RTQ-PCR runs are 
shown in Tables ASB, ASC, ASD, ASE and ASF. Please note that these sequences 
represent full-length physical clones. 

Table ASA. Probe Name Ag3924

Primers | Sequences | Length | Start Position | SEQ ID No
Forward | 5'-actgggaagaccattgaagagt-3' | 22 | 197 | 423
Probe | TET-5'-aaaagctccaaggaccgcagacttct-3'-TAMRA | 26 | 147 | 424
Reverse | 5'-gtttgaggatgcggtacactt-3' | 21 | 122 | 425



Table ASB. CNS_neurodegeneration_v1.0

Tissue Name | Rel. Exp.(%) Ag3924, Run 212343350 | Tissue Name | Rel. Exp.(%) Ag3924, Run 212343350
AD 1 Hippo | 8.4 | Control (Path) 3 Temporal Ctx | 6.0
AD 2 Hippo | 21.9 | Control (Path) 4 Temporal Ctx | 2.8
AD 3 Hippo | 8.4 | AD 1 Occipital Ctx | 14.4
AD 4 Hippo | 7.5 | AD 2 Occipital Ctx (Missing) | 0.0
AD 5 Hippo | 92.7 | AD 3 Occipital Ctx | 4.8
AD 6 Hippo | 24.5 | AD 4 Occipital Ctx | 14.0
Control 2 Hippo | [illegible] | AD 5 Occipital Ctx | 14.0
Control 4 Hippo | 7.1 | AD 6 Occipital Ctx | [illegible]
Control (Path) 3 Hippo | [illegible] | Control 1 Occipital Ctx | [illegible]
AD 1 Temporal Ctx | [illegible] | Control 2 Occipital Ctx | [illegible]
AD 2 Temporal Ctx | [illegible] | Control 3 Occipital Ctx | [illegible]
AD 3 Temporal Ctx | [illegible] | Control 4 Occipital Ctx | 4.3
AD 4 Temporal Ctx | [illegible] | Control (Path) 1 Occipital Ctx | [illegible]
AD 5 Inf Temporal Ctx | [illegible] | Control (Path) 2 Occipital Ctx | [illegible]
AD 5 Sup Temporal Ctx | [illegible] | Control (Path) 3 Occipital Ctx | [illegible]
AD 6 Inf Temporal Ctx | [illegible] | Control (Path) 4 Occipital Ctx | 15.5
AD 6 Sup Temporal Ctx | [illegible] | Control 1 Parietal Ctx | [illegible]
Control 1 Temporal Ctx | 4.5 | Control 2 Parietal Ctx | 40.3
Control 2 Temporal Ctx | 44.4 | Control 3 Parietal Ctx | 14.6
Control 3 Temporal Ctx | 11.1 | Control (Path) 1 Parietal Ctx | 70.7
Control 4 Temporal Ctx | 4.4 | Control (Path) 2 Parietal Ctx | 15.5
Control (Path) 1 Temporal Ctx | 49.0 | Control (Path) 3 Parietal Ctx | 4.9
Control (Path) 2 Temporal Ctx | 29.9 | Control (Path) 4 Parietal Ctx | 39.5



Table ASC. General_screening_panel_v1.4

Tissue Name | Rel. Exp.(%) Ag3924, Run 219515221 | Tissue Name | Rel. Exp.(%) Ag3924, Run 219515221
Adipose | 14.0 | Renal ca. TK-10 | 7.1
Melanoma* Hs688(A).T | 3.6 | Bladder | 8.1
Melanoma* Hs688(B).T | 4.9 | Gastric ca. (liver met.) NCI-N87 | 7.7
Melanoma* M14 | 15.1 | Gastric ca. KATO III | 17.4
Melanoma* LOXIMVI | 6.2 | Colon ca. SW-948 | 25.5
Melanoma* SK-MEL-5 | 37.6 | Colon ca. SW480 | 28.3
Squamous cell carcinoma SCC-4 | 1.1 | Colon ca.* (SW480 met) SW620 | 6.6
Testis Pool | 6.3 | Colon ca. HT29 | 4.1
Prostate ca.* (bone met) PC-3 | 47.0 | Colon ca. HCT-116 | 25.0
Prostate Pool | 18.6 | Colon ca. CaCo-2 | 6.9
Placenta | 6.3 | Colon cancer tissue | 7.6
Uterus Pool | 5.1 | Colon ca. SW1116 | 5.2
Ovarian ca. OVCAR-3 | 11.3 | Colon ca. Colo-205 | 2.6
Ovarian ca. SK-OV-3 | 6.8 | Colon ca. SW-48 | 4.4
Ovarian ca. OVCAR-4 | 12.2 | Colon Pool | [illegible]
Ovarian ca. OVCAR-5 | 17.9 | Small Intestine Pool | 93
Ovarian ca. IGROV-1 | 8.2 | Stomach Pool | 5.2
Ovarian ca. OVCAR-8 | 3.5 | Bone Marrow Pool | 4.9
Ovary | [illegible] | Fetal Heart | [illegible]
Breast ca. MCF-7 | [illegible] | Heart Pool | [illegible]
Breast ca. MDA-MB-231 | 11.4 | Lymph Node Pool | 52.9
Breast ca. BT 549 | 11.4 | Fetal Skeletal Muscle | [illegible]
Breast ca. T47D | [illegible] | Skeletal Muscle Pool | [illegible]
Breast ca. MDA-N | 11.7 | Spleen Pool | [illegible]
Breast Pool | 8.3 | Thymus Pool | [illegible]
Trachea | 15.4 | CNS cancer (glio/astro) U87-MG | [illegible]
Lung | [illegible] | CNS cancer (glio/astro) U-118-MG | 11.3
Fetal Lung | [illegible] | CNS cancer (neuro;met) SK-N-AS | [illegible]
Lung ca. NCI-N417 | [illegible] | CNS cancer (astro) SF-539 | [illegible]
Lung ca. LX-1 | [illegible] | CNS cancer (astro) SNB-75 | [illegible]
Lung ca. NCI-H146 | 4.5 | CNS cancer (glio) SNB-19 | [illegible]
Lung ca. SHP-77 | [illegible] | CNS cancer (glio) SF-295 | 24.0
Lung ca. A549 | [illegible] | Brain (Amygdala) Pool | 11.4
Lung ca. NCI-H526 | 9.4 | Brain (cerebellum) | 10.2
Lung ca. NCI-H23 | [illegible] | Brain (fetal) | 27.2
Lung ca. NCI-H460 | [illegible] | Brain (Hippocampus) Pool | 11.6
Lung ca. HOP-62 | [illegible] | Cerebral Cortex Pool | 17.2
Lung ca. NCI-H522 | [illegible] | Brain (Substantia nigra) Pool | 10.4
Liver | [illegible] | Brain (Thalamus) Pool | 18.9
Fetal Liver | 1.1 | Brain (whole) | 17.7
Liver ca. HepG2 | 3.4 | Spinal Cord Pool | 14.3
Kidney Pool | 26.4 | Adrenal Gland | 37.9
Fetal Kidney | 6.7 | Pituitary gland Pool | 5.0
Renal ca. 786-0 | 3.0 | Salivary Gland | 11.1
Renal ca. A498 | 1.4 | Thyroid (female) | 17.0
Renal ca. ACHN | 2.5 | Pancreatic ca. CAPAN2 | 2.8
Renal ca. UO-31 | 10.1 | Pancreas Pool | 13.3



Table ASD. Panel 4.1D

Tissue Name | Rel. Exp.(%) Ag3924, Run 170552351 | Tissue Name | Rel. Exp.(%) Ag3924, Run 170552351
Secondary Th1 act | 33.9 | HUVEC IL-1beta | 19.6
Secondary Th2 act | 35.4 | HUVEC IFN gamma | 32.3
Secondary Tr1 act | 29.3 | HUVEC TNF alpha + IFN gamma | 8.6
Secondary Th1 rest | 14.8 | HUVEC TNF alpha + IL4 | [illegible]
Secondary Th2 rest | 23.7 | HUVEC IL-11 | 17.2
Secondary Tr1 rest | 15.8 | Lung Microvascular EC none | 16.8
Primary Th1 act | [illegible] | Lung Microvascular EC TNFalpha + IL-1beta | 11.0
Primary Th2 act | 33.7 | Microvascular Dermal EC none | 27.7
Primary Tr1 act | [illegible] | Microvascular Dermal EC TNFalpha + IL-1beta | 8.6
Primary Th1 rest | [illegible] | Bronchial epithelium TNFalpha + IL1beta | 6.7
Primary Th2 rest | 15.3 | Small airway epithelium none | 4.7
Primary Tr1 rest | 34.2 | Small airway epithelium TNFalpha + IL-1beta | [illegible]
CD45RA CD4 lymphocyte act | 17.4 | Coronary artery SMC rest | 8.1
CD45RO CD4 lymphocyte act | 28.3 | Coronary artery SMC TNFalpha + IL-1beta | 4.4
CD8 lymphocyte act | 24.1 | Astrocytes rest | 16.4
Secondary CD8 lymphocyte rest | 18.2 | Astrocytes TNFalpha + IL-1beta | 11.9
Secondary CD8 lymphocyte act | 15.2 | KU-812 (Basophil) rest | 37.1
CD4 lymphocyte none | 12.8 | KU-812 (Basophil) PMA/ionomycin | 35.6
2ry Th1/Th2/Tr1_anti-CD95 CH11 | 21.0 | CCD1106 (Keratinocytes) none | 9.5
LAK cells rest | 17.8 | CCD1106 (Keratinocytes) TNFalpha + IL-1beta | 4.8
LAK cells IL-2 | 26.6 | Liver cirrhosis | 14.4
LAK cells IL-2+IL-12 | 17.8 | NCI-H292 none | 42.9
LAK cells IL-2+IFN gamma | 17.8 | NCI-H292 IL-4 | 57.0
LAK cells IL-2+IL-18 | 32.5 | NCI-H292 IL-9 | 81.2
LAK cells PMA/ionomycin | 7.9 | NCI-H292 IL-13 | 60.7
NK Cells IL-2 rest | 35.6 | NCI-H292 IFN gamma | 39.0
Two Way MLR 3 day | 17.3 | HPAEC none | 21.2
Two Way MLR 5 day | 17.1 | HPAEC TNF alpha + IL-1 beta | 13.4
Two Way MLR 7 day | 100.0 | Lung fibroblast none | 18.0
PBMC rest | 15.6 | Lung fibroblast TNF alpha + IL-1 beta | 6.0
PBMC PWM | 16.5 | Lung fibroblast IL-4 | 19.5
PBMC PHA-L | 13.8 | Lung fibroblast IL-9 | 30.8
Ramos (B cell) none | 54.6 | Lung fibroblast IL-13 | 22.2
Ramos (B cell) ionomycin | 70.2 | Lung fibroblast IFN gamma | 20.0


B lymphocytes PWM : 


23.8 iDermal fibroblast CCD1070 rest 12.5 


B lymphocytes CD40L and IL-4 3 


L7.0 1 


Dermal fibroblast CCD1070 TNF , 
ilpha 


M).l 


EOL-1 dbcAMP 


10.8 j 


Dermal fibroblast CCD1070 IL-1 ^ 
>eta 


i-4 


EOL-1 dbcAMP 

PMA/ionomycin d 


1.2 I 


Dermal fibroblast IFN gamma 8 


.2 


Dendritic cells none 1 


3.6 I 


)ermal fibroblast IL-4 1 


7.8 


Dendritic cells LPS , 4 


5 I 


)ermal Fibroblasts rest 2 


0.0 

Dendritic cells anti-CD40


71 fk 
Zl.D 


Neutrophil^fSCFaCEP^ 




2$ - 


I! 


Monocytes rest 


19.8 


Neutrophils rest 


3.6 




Monocytes LPS 


3.0 


Colon 


35.6 




Macrophages rest 


14.9 


Lung 


27.7 




Macrophages LPS 


1.7 


Thymus 


27.7 




HUVEC none 


16.7 


Kidney 


66.4 




HUVEC starved


17.7 









Table ASE. Panel 5 Islet 



Tissue Name 


Rel. 

Exp.(% 

Ag3924, 

IVUQ 

268363571 


Tissue Name 


Rel. 

Exp.(%) 
Ag3924, 

IV II 11 

268363571 


97457JPatient-02go_adipose 


18.2 


94709_Donor 2 AM - A_adipose 


19.6 


y f^t/ o__x anen i-u / SK^SKeiecai 
muscle 


10.6 


94710_Donor 2 AM - B_adipose 


13.3 


97477 JPatient-07ut_uterus 


10.2 


9471 l_Donor 2 AM - C_adipose 


11.0 




17 0 


04719 Donor 9 AD -A »dtnn«<» 


9 5 


QQ167 Ravpr Patif^rtf 1 


V/.J 


0471^ Donor 9 AD -R »nino<M» 


71 0 1 


97482JPatient-08ut_uterus 


6.8 


94714JDonor 2 AD - C_adipose 


16.7 


97483^Patient-08pljplacenta 


11.7 


94742_Donor 3 U - AJvlesenchymal 
Stem Cells 


1.8 


y /4oo_rauenc-uysjc__sKeieiai 
muscle 


10.6 


ft+D^jJOTiOT d u - r>__iviesencnymai 
Stem Cells 


1.7 


97487JPatient~09ut_uterus 


12.0 


94730_Donor 3 AM - A_adipose 


19.6 


97488JPatient-09pl_pIacenta 


15.4 


94731JDonor 3 AM - B__adipose 


12.5 


97492JPatienM0ut_uterus 


12.9 


94732_Donor 3 AM - C_adipose 


12.2 


97493 JPatient-lOpLplacenta 


29.5 


94733 _J>onor 3 AD - A_adipose 


10.2 


97495 JPatient-1 lgo_adipose 


17.9 


94734JDonor 3 AD - B^adipose 


9.2 


97496 JPatient-1 1 sk_skeletal 
muscle 


70.7 


94735 _Donor 3 AD - C_adipose 


8.9 


97497 J^tient-llut.uterus 


18.8 


77138_LiverJIepG2untreated 


11.1 


97498 JPatient-1 lpLplacenta 


10.3 


73556_Heart_Cardiac stromal cells 
(primary) 


5.2 


97500JPatient-12go_adipose 


31.9 


81735JSmall Intestine 


15.9 


97501_Patient-12st.skeletal 
muscle 


100.0 


72409 JKidneyJYoximal Convoluted 
Tubule 


6.5 


97502J , atient-12ut_uteras 


23.8 


82685 JSmall intestine J>uodemim 


17.0 


97503 JPatient-12pLplacenta 


8.7 


90650_AdrenaLAdrenocortical 
adenoma 


14.4 


94721J)onor2U- 
AJMesenchymal Stem Cells 


3.9 


72410JCidneyJHRCE 


11.5 

94722JDonor2U- 
B_Mesenchymal Stem Cells 


2.8 


72411JKidney_HRE 


3.4 


94723 JDonor2U- 
C_Mesenchymal Stem Cells 


4.8 


73139_Uterus_Uterine smooth 
muscle cells 


2.1 



Table ASF. general oncology screening panel_v_2.4

Tissue Name                 Rel. Exp.(%) Ag3924,   Tissue Name                 Rel. Exp.(%) Ag3924,
                            Run 268143856                                      Run 268143856
Colon cancer 1              60.3                   Bladder NAT 2               3.3
Colon NAT 1                 29.7                   Bladder NAT 3               2.4
Colon cancer 2              26.1                   Bladder NAT 4               25.7
Colon NAT 2                 60.7                   Prostate adenocarcinoma 1   100.0
Colon cancer 3              88.9                   Prostate adenocarcinoma 2   14.6
Colon NAT 3                 88.9                   Prostate adenocarcinoma 3   86.5
Colon malignant cancer 4    98.6                   Prostate adenocarcinoma 4   34.9
Colon NAT 4                 29.5                   Prostate NAT 5              26.2
Lung cancer 1               17.3                   Prostate adenocarcinoma 6   24.5
Lung NAT 1                  7.9                    Prostate adenocarcinoma 7   39.5
Lung cancer 2               31.9                   Prostate adenocarcinoma 8   15.2
Lung NAT 2                  14.8                   Prostate adenocarcinoma 9   53.6
Squamous cell carcinoma 3   34.2                   Prostate NAT 10             12.6
Lung NAT 3                  5.0                    Kidney cancer 1             12.0
Metastatic melanoma 1       28.3                   Kidney NAT 1                25.9
Melanoma 2                  4.8                    Kidney cancer 2             53.6
Melanoma 3                  12.9                   Kidney NAT 2                64.6
Metastatic melanoma 4       42.6                   Kidney cancer 3             12.5
Metastatic melanoma 5       70.7                   Kidney NAT 3                26.6
Bladder cancer 1            9.3                    Kidney cancer 4             15.0
Bladder NAT 1               0.0                    Kidney NAT 4                14.6
Bladder cancer 2            17.7







CNS_neurodegeneration_v1.0 Summary: Ag3924 This panel does not show
differential expression of this gene in Alzheimer's disease. However, this profile confirms
the expression of this gene at moderate levels in the brain. Please see Panel 1.4 for
discussion of this gene in the central nervous system.

General_screening_panel_v1.4 Summary: Ag3924 Highest expression of this
gene is seen in a breast cancer cell line (CT=25.3). This gene is ubiquitously expressed in
this panel, with high to moderate expression seen in brain, colon,
ovarian, and melanoma cancer cell lines. This expression profile suggests a role for this
gene product in cell survival and proliferation. Modulation of this gene product may be
useful in the treatment of cancer.

Among tissues with metabolic function, this gene is expressed at moderate to high
levels in pituitary, adipose, adrenal gland, pancreas, thyroid, and adult and fetal skeletal
muscle, heart, and liver. This widespread expression among these tissues suggests that this
gene product may play a role in normal neuroendocrine and metabolic function and that
dysregulated expression of this gene may contribute to neuroendocrine disorders or
metabolic diseases, such as obesity and diabetes. This gene encodes a novel glycerol
3-phosphate dehydrogenase (G3PD).

Similar to known cytosolic glycerol 3-phosphate dehydrogenase, this putative
G3PD may contribute to glycerol synthesis and link glycolysis with triglyceride (TG)
production. This gene is highly expressed in skeletal muscle and diabetic skeletal muscle
on Panel 5I. Diabetic skeletal muscle has increased glycolytic activity and increased lipid
content that interfere with insulin sensitivity. Inhibition of G3PD may balance
disproportionate glycolysis and impair accumulation of TG in skeletal muscle. Thus, an
antagonist of this novel G3PD may be beneficial for the treatment of diabetes.

This gene is also expressed at high to moderate levels in the CNS, including the
hippocampus, thalamus, substantia nigra, amygdala, cerebellum and cerebral cortex.
Therefore, therapeutic modulation of the expression or function of this gene may be useful
in the treatment of neurologic disorders, such as Alzheimer's disease, Parkinson's disease,
schizophrenia, multiple sclerosis, stroke and epilepsy.

In addition, this gene is expressed at much higher levels in fetal lung tissue
(CT=27.5) when compared to expression in the adult counterpart (CT=30.5). Thus,
expression of this gene may be used to differentiate between the fetal and adult source of
this tissue.
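
The fetal versus adult comparison above rests on a difference in threshold cycle (CT). As a rough illustration, and assuming ideal amplification in which the template doubles every cycle (an assumption of this sketch, not a statement about the efficiency achieved in these runs), a CT difference can be converted to an approximate fold difference. The function name `fold_difference` and its `efficiency` parameter are hypothetical names chosen for this example.

```python
def fold_difference(ct_reference: float, ct_sample: float, efficiency: float = 2.0) -> float:
    """Approximate fold excess of template in the sample over the reference.

    A lower CT means more starting template. With perfect doubling
    (efficiency = 2.0, an idealizing assumption), each cycle of CT
    difference corresponds to a 2-fold difference in template.
    """
    return efficiency ** (ct_reference - ct_sample)


# CT values quoted in the text: adult lung CT=30.5, fetal lung CT=27.5.
fetal_vs_adult_lung = fold_difference(ct_reference=30.5, ct_sample=27.5)
print(f"Fetal lung vs adult lung: about {fetal_vs_adult_lung:.0f}-fold")
```

Under these assumptions, the 3-cycle gap between fetal and adult lung corresponds to roughly an 8-fold difference in transcript level.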

Panel 4.1D Summary: Ag3924 Highest expression is seen in a sample derived
from an MLR, where the sample was taken 7 days after the reaction (CT=27.6). This gene is
also expressed at high to moderate levels in a wide range of cell types of significance in the
immune response in health and disease. These cells include members of the T-cell, B-cell,
endothelial cell, macrophage/monocyte, and peripheral blood mononuclear cell family, as
well as epithelial and fibroblast cell types from lung and skin, and normal tissues
represented by colon, lung, thymus and kidney. This ubiquitous pattern of expression
suggests that this gene product may be involved in homeostatic processes for these and
other cell types and tissues. This pattern is in agreement with the expression profile in
General_screening_panel_v1.4 and also suggests a role for the gene product in cell survival
and proliferation. Therefore, modulation of the gene product with a functional therapeutic
may lead to the alteration of functions associated with these cell types and lead to
improvement of the symptoms of patients suffering from autoimmune and inflammatory
diseases such as asthma, allergies, inflammatory bowel disease, lupus erythematosus,
psoriasis, rheumatoid arthritis, and osteoarthritis.

Panel 5 Islet Summary: Ag3924 Highest expression is seen in skeletal muscle
from a diabetic patient (patient 12) (CT=28). This panel confirms expression of this gene in
metabolic tissues including adipose, skeletal muscle and placenta. Please see Panel 1.4 for
discussion of this gene in metabolic disease.

general oncology screening panel_v_2.4 Summary: Ag3924 Highest expression
is seen in a prostate cancer sample (CT=28.2). Prominent expression is also seen in
melanoma samples, as well as in normal and malignant kidney, colon and lung. Thus,
modulation of this gene may be useful in the treatment of prostate cancer and melanoma.

AT. CG96613-02 and CG96613-03: Splice variant of PDK1.

Expression of genes CG96613-02 and CG96613-03 was assessed using the
primer-probe sets Ag1778 and Ag5110, described in Tables ATA and ATB. Results of the
RTQ-PCR runs are shown in Tables ATC, ATD, ATE, ATF, ATG and ATH. Please note
that probe-primer set Ag1778 is specific for CG96613-03.

Table ATA. Probe Name Ag1778

Primers   Sequences                                    Length   Start Position   SEQ ID No
Forward   5'-gattgcccatatcacgtcttta-3'                 22       1241             426
Probe     TET-5'-cgcacaatacttccaaggagacctga-3'-TAMRA   26       1263             427
Reverse   5'-gataactgcatctgtcccgtaa-3'                 22       1308             428

Table ATB. Probe Name Ag5110

Primers   Sequences                                    Length   Start Position   SEQ ID No
Forward   5'-tgtatggcctgcaagatgat-3'                   20       559              429
Probe     TET-5'-tcattcccacaatggcccagg-3'-TAMRA        21       623              430
Reverse   5'-agctctccttgtattcaatcaca-3'                23       645              431
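
The primer and probe entries in Tables ATA and ATB list a sequence together with its stated length, so the two can be cross-checked mechanically. The following is a minimal sketch of such a check. The sequences and lengths are transcribed from the tables, while the helper names (`OLIGOS`, `length_mismatches`, `gc_percent`, `reverse_complement`) are hypothetical and introduced only for illustration.

```python
# Sequences (5'->3', without TET/TAMRA labels) and stated lengths,
# transcribed from Tables ATA and ATB.
OLIGOS = {
    "Ag1778": {
        "forward": ("gattgcccatatcacgtcttta", 22),
        "probe": ("cgcacaatacttccaaggagacctga", 26),
        "reverse": ("gataactgcatctgtcccgtaa", 22),
    },
    "Ag5110": {
        "forward": ("tgtatggcctgcaagatgat", 20),
        "probe": ("tcattcccacaatggcccagg", 21),
        "reverse": ("agctctccttgtattcaatcaca", 23),
    },
}


def length_mismatches(oligos):
    """Return (set, role) pairs whose sequence length differs from the stated length."""
    return [
        (set_name, role)
        for set_name, roles in oligos.items()
        for role, (seq, stated) in roles.items()
        if len(seq) != stated
    ]


def gc_percent(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.lower()
    return 100.0 * sum(base in "gc" for base in seq) / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written with a, c, g, t."""
    return seq.lower().translate(str.maketrans("acgt", "tgca"))[::-1]


print("Length mismatches:", length_mismatches(OLIGOS))
for set_name, roles in OLIGOS.items():
    for role, (seq, _) in roles.items():
        print(f"{set_name} {role}: GC {gc_percent(seq):.1f}%")
```

An empty mismatch list indicates that every transcribed sequence agrees with the length given in its table row, which is a quick guard against transcription errors in the oligonucleotide sequences.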



Table ATC. CNS_neurodegeneration_v1.0



Tissue Name 


Rel. 

Exp.(%) 
Agl778, 
Run 

276596797 


ReL 

Exp.(%) 
Ag5110, 
Run 

226442922 


ReL 
Exp.(%) 
AgSllO, 
Run 

276596798 


Tissue Name 


Rel. 

Exp.(%) 

Agl / / o, 

Run 

27659679 
7 


ReL 

Exp.(%) 
Run 

22644292 
2 


ReL 

Exp.(.%) 
Run 

27659679 
8 


AD 1 Hippo 


11.7 


6.2 


5.3 


Control 
(Path) 3 
Temporal 
Ctx 


6.6 


12.2 


17.7 


AD 2 Hippo 


31.4 


7.4 


20.3 


Control 
(Path)4 
Temporal 
Ctx 


33.4 


15.8 


13.3 


AD 3 Hippo 


12.5 


5.3 


4.9 


AD 1 

Occipital Ctx 


23.0 


7.7 


8.0 


AD 4 Hippo 


5.4 


9.4 


0.0 


AD 2 

Occipital Ctx 
(Missing) 


0.0 


0.0 


0.0 


AD 5 Hippo 


82.4 


79.0 


45.4 


AD3 

Occipital Ctx 


12.2 


6.2 


5.8 


AD 6 Hippo 


54.3 


88.3 


70.2 


AD4 

Occipital Ctx 


16.3 


18.0 


7.0 


Control 2 
Hippo 


17.9 


18.8 


19.5 


AD 5 

Occipital Ctx 


77.9 


29.9 


26.2 


Control 4 
Hippo 


13.0- 


19.3 


13.3 


AD6 

Occipital Ctx 


36.9 


18.9 


18.8 


Control 
(Path)3 
Hippo 


11.0 


7.5 


16.3 


Control 1 
Occipital Ctx 


6.2 


6.8 ; 


5.4 


AD 1 

Temporal 

Ctx 


20.3 


14.6 


11.0 


Control 2 
Occipital Ctx 


54.0 


44.8 j 


51.4 


AD2 

Temporal 

Ctx 


29.9 


16.6 . 


21.8 


Control 3 
Occipital Ctx 


32.3 


4.9 


26.8 
AD 3

Temporal 

Ctx 


11.7 


Q A 


IT T 
I /./ 


- — p| 
Control 4 
Occipital Ctx 




»ObS/ 1 


*\ ^ 


AD4 

Temporal 

Ctx 


20.2 




ly.o 


Control 

/T> a tK\ 1 

(Jraui/ a 
Occipital Ctx 


fin ^ 


7A 7 


41 R 

Hi .O 


ADS Inf 
Temporal 
Ctx 


72.2 


47 .U 


AC 1 

40.3 


Control 
{Jraul) Z 
Occipital Ctx 


IZ.o 




O.J 


AD 5 Sup 
Temporal 
Ctx 


39.5 


51.1 


44.1 


Control 
(ratn) i 
Occipital Ctx 


D-D 


n o 


n n 
u.u 


AD6M 
Temporal 
Ctx 


75.3 


O A 1 

84.1 


OA 1 

84.1 


Control 

/T> n +U\ A 

(ratnj 4 
Occipital Ctx 


lo.o 






AD6Sup 
Temporal 
Ctx 


100.0 


100.0 


100.0 


Control 1 
Parietal Ctx 


i a a 


1 A A 




Control 1 
Temporal 
Ctx 


11.2 


10.4 


3.9 


Control 2 
Parietal Ctx 


46.0 


57.U 




Control 2 
Temporal 
Ctx 


25.3 


21.6 


363 


Control 3 
Parietal Qx 


23.5 


lo.3 


lo.o 


Control 3 
Temporal 
Ctx 


31.2 


37.9 


38.2 


Control 
(Path)l 
Parietal Ctx 


78.5 


39.2 


52.5 


Control 3 
Temporal 
Ctx 


11.7 


8.4 


8.8 


Control 
(Path) 2 
Parietal Qx 


23.5 


12.5 


14.9 


Control 
(Path)l 
Temporal 
Ctx 


36.6 


53.6 


46.7 


Control 
(Path) 3 
Parietal Ctx 


9.5 


13.9 


5.8 


Control 
(Path) 2 
Temporal 
Ctx 


46.0 


29.7 


32.5 


Control 
(Path) 4 
Parietal Ctx 


46.0 


58.6 


39.2 



Table ATD. General_screening_panel_v1.5



Tissue Name 


Rel. 

Exp.(%) 
Ag5110, 
Run 

228980585 


issue Name 


Rel. 

Exp.(%) 
AgSUO, 
Run 

228980585 


Adipose 


5.4 


Renal ca.TK-10 


11.7 


Melanoma* Hs688(A).T 


10.7 


Bladder 


12.2 
Melanoma* Hs688(B).T


5 R 
j.o 


Gastricca Bftirf)W5ft^^' 


J.O 


lYieianoma^ jyxi4 




f^ocrnV ra T^ATO TIT 

Vjriii> LI It* Ca. JVrt-X W JXL 


1ft 6 


A/TAlan/vma* T OVnV/TVT 
jyicJanunid ukjtuivx v 1 


17 3 






A/IolonATno^ CI?' \4ThT ^ 

iviei an onia ojk-iyjldi *-d 




V^UlUIl Ca. O Yr tow 


16 6 


OqUamOUS Cell CctrClilOIlla ola^.-** 


d 9 


PnTnn * fSWdSO meO SW670 


10 R 

lv.O 


X es US JrOOl 


0 9 


Pnlnn rn "HT90 

V_-UlvJlI va. All JL,y 


17 0 


XTOSiaie ca»^ ^oone roex; r^>o 


dR n 

HO.U 


CcAcsxx r« T-TPT-1 16 
V^OIt)jl va» XjIV>>X 1AU 


vJ. / 


Prostate Pool 


u.o 


Prklrk-n ra ^or^n-7 


Q R 


Placenta - 


ft ^ 

U.J 


v^oion cancer ussue 


7 1 
/.X 


uterus JrOOl 


9 3 

z. j 


L^UlUn Ca. O VY X X XO 




uvanan ca. u v lak-j 


j.j 


PrklnTi r«Q /°nTn_9ftS 
^(Jlun Ca. v,UlU-ZV/J 


J*J 


uvanan ca. oiv-uv-j 


11 o 
1 X.O 


Loion ca. o w -*f o 


J. 7 


uvanan ca. uvlak-4 


7 0 


LOlOn XrOOJ 


ft R 


u van an ca. uvlako 


17 A 


amau luxesiine tr 001 


1 9 

x.z 


uvanan ca. xukuv-i 


o. / 


otornacn r 001 


9 9 


Uvanan ca. UVLAK-o 


8 7 
o.Z 


jDone ivi arrow jtooi 


1 9 
x.z 


Ovary 


ft 1 
U.J 


Fecaj riean 


13 ft 

X J.U 


Breast ca. Mtr- / 


A 1 
4.J 


rieart v 001 


d. ft 


Breast ca. MPA-MB-231 


25.0 


Lymph Node Pool 


ft Q 


Breast ca. Bl 54y 


Zx.j 


reiax oKeietai jvxuscie 


ft £ 
U.O 


Breast ca, i4/JLJ 


O 7 
Z. / 


oKeieiai jviuscie Jrooi 


1 7 
I. / 


Breast ca. MDA-N 


17.2 


opieen r 001 


7 ^ 
/.J 


Breast rool 


A 7 


x nyrnus r 001 


1 1 A 
XX.O 


Trachea 


^1 ft 

21. y 


LJNo cancer (giio/astro^ Uo/-ivnjr 


4o.J 


Lung 


1 o 

L2 


lino cancer (gno/astroj u-iio-jvio 


71 7 


Fetal Lung 


A ft 


l<xno cancer ^neuro^niei^ oiv-in-ao 


7 9 


T -.o, MPT XT/1 1*7 

Lung ca. INLx-lN4i/ 




ljno cancer ^asTxo^ ojrojy 


1A £ 

xo.o 


Lung ca. LX-1 




v-jNi> cancer ^astro^ oinx>.- / j 


94 7 


Lung ca. lNLl-rix4o 


J.J 


i^Pio cancer i.g-uo,; oiNx>-xy 


lift 
XX.U 


T OTTO T7 

Lung ca. oxir-/ / 


17 7 
X /./ 


v^JNo cancer vgno^ or , -zyj 


97 ^ 
Z/.J 


Lung ca. A549 


£ 0 


X? r*o in ( A T**»tf nrrlol i a\ \^ r\r>\ 

DidiD. ^/vnxyguHXay JT UU1 


9ft 

CAM 


Lung ca. jnli-x-ijZo 


110 


Drain ^cereueuunij 


* 9 l 
J.Z 


Lung ca. inl,x-jizj 


d 7 


Drain vxetai y 


1 ft 


Lung ca. inl*i-h4uU 


37 3 ! 


D I alii V,JtXippULariipUd/ lUUl 


9 0 


Lung ca. nur-oz 


Q 7 


v^cxcDXax v-oricA ruui 


1 Q 


T im/T /»o 1MPT £T^79 

Lung ca. fsv^i-xijzz 




Diain ^OUQSlanUa ulgla^ J; UUI 


1 6 

x.u 


Liver 


ft d 


T5t*oiri t' i S Ti old TYl 1 1 c "\ Pi"»n1 

Drain ^ x Jiaiainus j rooi 


1 7 

x» / 


Jreiai nver 




DXaill yvYllUlCJ 


30 


Liver ca. HepG2 


15.4 


Spinal Cord Pool 


1.0 


Kidney Pool 


1.6 


Adrenal Gland 


14.9 


Fetal Kidney 


2.2 


Pituitary gland Pool 


0.4 


Renal ca. 786-0 


10.5 


Salivary Gland 


6.1 


Renal ca. A498 


0.2 


Thyroid (female) 


0.5 


Renal ca. ACHN 


8.4 


Pancreatic ca. CAPAN2 


2.6 
Table ATE. General_screening_panel_v1.6



Tissue Name 


ReL 

Exp.(%) 
Agl778, 
Run 

277218713 


Rel. 

Exp.(%) 
Ag5110, 
Run 

277218715 


issue Name 


ReL 

Exp.(%) 
Agl778, 
Run 

277218713 


Rel. 

Exp.(%) 
AgSllO, 
Run 

277218715 


Adipose 


8.8 


8.7 


Renal ca.TK-10 


31.6 


13.5 


Melanoma* 
Hs688(A).T 


45.1 


15.5 


Bladder 


23.3 


14.5 


Melanoma* 
Hso88(B).T 


34.6 


11.7 


Gastric ca. (liver 
met.) NCI-N87 


22.1 


5.0 


Melanoma* M14 


29.3 


11.6 


Gastnc ca. KATO 

in 


9.0 


15.3 


Melanoma* 
LOXIMVI 


16.6 


32.1 


Colon ca. SW-948 


9.2 


4.4 


Melanoma* 
SK-MEL-5 


23.0 


36.9 


Colon ca. SW480 


35.8 


22.5 


Squamous cell 
carcinoma SCC-4 


16.6 


7.2 


Colon ca* (SW480 
meO SW620 


24.0 


11.9 


Testis Pool 


8.9 


8.5 


Colon ca. HT29 


32.1 


21.5 


Prostate ca.* (bone 
met)P03 


100.0 


50.7 


Colon ca.HCT-1 16 


17.9 


9.3 


Prostate Pool 


J J 


1./ 


Colon ca. CaCo-2 


21.6 


13.5 


Placenta 


l.o 


U.3 


Colon cancer tissue 


7 O 

3.2 


10.5 


Uterus Pool 


1 < 
3.J 


i 1 
3.1 


_^ OTS71 1 1 a 

Colon ca. MVlllo 


3.5 


2.7 


Ovarian ca. 

KJ V LrAlv-O 


11.6 


9.5 


Colon ca. Colo-205 


6.7 


4.5 


Ovarian ca. SK-OV-3 


33.0 


20.3 


Colon ca. SW-48 


12.1 


5.2 


Ovarian ca 
OVCAR-4 


11.4 


10.7 


Colon Pool 


6.6 


1.8 


Ovarian ca. 
OVCAR-5 


28.1 


24.8 


Small Intestine Pool 


9.0 


3.0 


Ovarian ca. 
IGROV-1 


29.1 


12.7 : 


Stomach Pool 


5.6 


4.5 


Ovarian ca 
OVCAR-8 


15.9 


0.1 


Bone Marrow Pool 


5.1 


2.4 


Ovary 


4.4 


1.6 


Fetal Heart 


61.6 


26.4 


Breast ca.MCF-7 


5.9 


3.6 


Heart Pool 


6.8 


8.8 


Breast ca. 
MDA-MB-231 


79.0 


34.4 


Lymph Node Pool 


10.4 


0.8 


Breast ca. BT 549 


35.6 


15.9 


Fetal Skeletal 
Muscle 


5.6 


3.6 
Breast ca. T47D


3.0 


3.4 


Skeletal Muscfe 11 ■ 
Pool 


• liJS Oriel! 
0.9 


..tJt „1L ^ , 

0.7 


breast ca. MUA-JN 


OA 1 

JXjJ 


on q 


Spleen Pool 




13.0 


Breast Pool 


1A 


1.9 


Thymus Pool 


20.2 


12.6 


Trachea 


23.8 


33.7 


CNS cancer 

(glio/astro) 

U87-MG 


47.0 


51.1 


Lung 


4.6 


1.0 . 


CNS cancer 

(glio/astro) 

U-118-MG 


43.2 


100.0 


Fetal Lung 


17.4 


O 1 

8.1 


CNS cancer 
(neuro;met) 
SK-N-AS 


t A 1 

14.1 


7.7 


Lung ca. NCI-N417 


16.2 


16.0 


CNS cancer (astro) 
SF-539 


35.1 


28.3 


Lung ca. LX-1 


38.7 


8.8 


CNS cancer (astro) 


50.3 


30.8 


Lungca. NCI-H146 


16.7 


5.9 


SNB-19 


34.4 


______ 

13.1 


Lung ca. SHP-77 


53.2 


25.9 


CNS cancer (glio) 


93.3 


46.0 


Lung ca. A549 


10.9 


9.9 


Brain (Amygdala) 

Pool 


7.7 


2.3 


Lungca.NCI-H526 10.1 


10.9 


Brain (cerebellum) 


24.7 


5.3 


Lung ca. NCI-H23 


12.2 


9.2 


Brain (fetal) 


9.7 


1.3 


Lung ca. NCI-H460 


57.4 


57.8 


Brain 

(Hippocampus) 

Pnnl 


9.7 


2.8 


Lung ca. HOP-62 


39.0 


9.7 


Pool 


9.6 


3.3 


Lung ca. NCI-H522 


19.5 


13.3 


Brain (Substantia 
nigra) Pool 


6.0 


2.8 


Liver 


1.5 


06 

\J.\J 


Brain (Thalamus) 
Pool 




1 0 


Fetal Liver 


15.1 


6.0 


Brain (whole) 


9.5 


3.3 


Liver ca. HepG2 


41.5 


18.2 


Spinal Cord Pool 


5.8 


2.1 


Kidney Pool 


9.6 


2.0 


Adrenal Gland 


27.5 


23.3 


Fetal Kidney 


14.7 


2.6 


Pituitary gland Pool 


2.5 


1:0 


Renal ca. 786-0 j 


14.5 


11.0 


Salivary Gland 


9.8 


10.4 


Renal ca. A498 


2.2 


0.9 


Thyroid (female) 


1.5 


1.9 


Renal ca. ACHN 


9.5 


10.8 


Pancreatic ca. 
CAPAN2 


9.7 


5.3 


Renal ca.UO-31 


13.4 |4.6 


Pancreas Pool 


18.0 |7.2 



B 
Table ATF. Panel 1.3D



Tissue Name 


ReL 
Exp.(% 
Agl778, 
Kun 

157790405 


Tissue Name 


ReL 

Exp.(%) 
Agl778, 
Kun 

157790405 


Liver adenocarcinoma 


6/7 


Kidney (fetal) 


12.1 


Pancreas 


1.3 


Renal ca. 786-0 


6.8 


Pancreatic ca CAPAN 2 


2.1 


Renal ca. A498 


12.2 


Adrenal gland 


18.7 


Renal ca. RXF 393 


15.0 


Thvroid 


2.9 


Renal ca ACHN 


3.2 


Salivarv pland 


6.2 


Renal ca. UO-31 


8.4 


Piriiitarv dand 


5 7 


Renal ca TK-10 


3.6 


Brain ^fetaH 

A-*l UiU IXuUIW 


2.5 


Liver 


3.0 


R rain /wholes 


4.8 


Liver ( fetal} 


14.7 




6.3 


Liver ca fhenatoblafift I-Ter)G2 


25.5 




5.4 


T/IITIP 
JUUIlg 


13.7 




22.8 




5.3 


Rr^in fQiihstaTitia niPTa^ 

xJlcLXlx \0 ULfo IdXl H <X lll^iaj 


i 1 

X. JL 


T .line* ca ^ small celH LX^-1 


14.5 




3 3 


Lunff ca (small celH NCI-H69 


49 


OprpVwal frvrfpir 

L/W \j ULCLL V^Ul IGA. 


14.7 


Lunff ca f s cell var i SFTP-77 


36.1 


Srnnal cord * 


2.3 


T avx\p ca flange cel1YNCI-H460 


12.9 


<rlio/a<;tro TTR7-MCi 


21 6 


Lnnora fnon-sm eel H A 549 


8.1 


plio/astro II- 1 1 8-MG 


56 3 


Limffca /rion-s cell) NCI-H23 


7.3 


astrocytoma SW1783 


31.2 


Lunff ca Tnon-s cell'i HOP-62 


12.8 


neiiro** met SK-N-AS 


30.4 


Lurii? ca Cnon-s cl) NCI-H522 


4.5 


astrocvtoma SF-539 


22.2 


Lime ca Csauam.) S W 900 


1.5 


astrocvtoma SNB-75 


12.6 


t imp ca. Csauam.) NCI-H596 


0.7 


glioma SNB-19 


29.9 


Majrimary gland 


9.7 


glioma U251 


22.2 


Breast ca.* (plef) MCF-7 


4.6 


glioma SF-295 


20.3 


Breast ca.* (pl.ef) MDA-MB-23 1 


100.0 


Heart ( fetal) 


35.4 


Breast ca.* (plef) T47D 


5.1 


Heart 


4.5 


Breast ca. BT-549 


45.1 


Skeletal muscle (fetal) 


26.1 


Breast ca. MDA-N 


28.9 


Skeletal muscle 


3.1 


Ovary 


4.0 


Bone marrow 


13.1 


Ovarian ca. OVCAR-3 


4.5 


Thymus 


6.2 


Ovarian ca. OVCAR-4 


3.5 


Spleen 


15.5 


Ovarian ca. OVCAR-5 


13.4 


Lymph node 


16.3 


Ovarian ca. OVCAR-8 


3.1 


Colorectal 


7.9 


Ovarian ca. IGROV-1 


4.2 i 


Stomach 


14.5 


Ovarian ca* (ascites) SK-OV-3 


13.2 


Small intestine 


15.5 


Uterus 


3.1 


Colon ca. SW480 


9.7 


Placenta 


4.3 
Colon ca.* SW620 (SW480 met)


^ (rotate PCT^USDB: 


>2 - 


Colon ca. Jtiizy 


25.5 (Prostate ca.* (bone met)PC-3 


10. / 


Colon ca. hici-lio 


5.1 jTestis 






8.1 


Melanoma Hs688(A).T 


7.1 


Colon ca. tissue(OD03866) 


8.4 


Melanoma* (met) Hs688(B).T 


3.8 


Colon ca. HCC-2998 


12.2 


Melanoma UACC-62 


2.0 


Gastric ca.* (liver met) NCI-N87 


11.1 


Melanoma M14 


11.4 . 


Bladder 


8.0 


Melanoma LOX MVI 


10.8 


Trachea 


17.7 


Melanoma* (met) SK-MEL-5 


5.2 


Kidney 


0.7 


Adipose 


4.9 



Table ATG. Panel 4.1D 



Tissue Name 


ReL 

Exp.(%) 
Agl778, 

Dim 

Jvun 

276596860 


Rel. 

Exp.(%) 
Agl778, 

Dam 

276686878 


Rel. 

Exp.(%) 
Ag5110, 

XV.UD 

226444095 


Rel. 

Exp.(%) 
AgSllO, 
xvun 

276596862 


Rel. 

Exp.(%) 
Ag5110, 

Pun 

xvun 

276686880 


Secondary Thl act 


23.5 


26.8 


13.9 


14.9 


9.0 


Secondary Th2 act 


28.7 


28.1 


11.4 


14.8 


17.9 


Secondary Trl act 


5.4 


8.4 


7.9 


1.9 


4.5 


Secondary Thl rest 


2.9 


3.8 


p.i 


l.U 


L.J 


Secondary Th2 rest 


7.4 


4.3 


11.3 


4.3 


2.7 


Secondary Trl rest 


4.3 


4.9 


6.6 


4.8 


1.4 


Primary Thl act 


4.5 


5.6 


13.9 


5.0 


1.8 


Primary Th2 act 


23.2 


16.8 


14.4 


14.4 


16.5 


Primary Trl act 


22.2 


23.3 


13.9 


11.1 


12.3 


Primary Th i rest 


3.1 


3.3 


2.2 


0.0 


0.0 


Primary Th2 rest 


6.8 


4.2 


5.6 


0.0 


0.0 


Primary Trl rest 


2.6 


3.6 


10.3 


0.7 


0.0 


CD45RA CD4 
lymphocyte act 


25.5 


26.4 


9.5 


18.3 


16.2 


CD45RO CD4 
lymphocyte act 


40.1 


27.2 


22.1 


27.9 


22.4 


CD8 lymphocyte act 


5.1 


7.4 


13.1 


8.1 


24 


Secondary CD8 
lymphocyte rest 


3.3 


5.1 


20.9 


32.3 


5.1 


Secondary CD8 
lymphocyte act 


4.3 


3.7 


3.3 


1.3 


0.0 


CD4 lymphocyte none 


13.3 


8.6 


13.7 


,4.3 


4.9 


2ry 

Thl/Th2/Trl_anti-CT)95 
CHU 


3.2 


5.2 


8.1 : 


3.1 


2.4 


LAK cells rest 


13.2 


6.7 


10.1 


5.6 


4.6 
LAK cells IL-2


9.1


8.0 


i K» rs.rg- 

j if IMS 11 . 


^ J^J 1 v™ll t.!> IK™ 


,.- -Jl -i|T ""!■ -t 

^ .^l"" ««p .» 


LAK cells IL-2+IL-12 


0.8 


1.3 


11.0 


1.7 


0.0 


LAK cells 1L-2+IFN 
gamma 


9.2 


8.5 


12.2 


4.8 


7.6 


LAK cells IL-2+ IL-18 


6.4 


5.1 


15.6 


3.7 


12.2 


LAK cells 
PMA/ionomycin 




100.0 


100.0 


100.0 


100.0 


NK Cells IL-2 rest 


27.5 


l/.o 


O, / 


/.I 


14. / 


Two Way MLR 3 day 


16.8 


21.2 


lo.J 


< 1 


in i 


Two Way MLR 5 day 


2.9 


2.1 


4.Z 


1-/ 


U.U 


Two Way MLR 7 day 


6.2 


2.6 


3.4 




z.o 


PBMC rest 


3.6 


3.7 


5.9 


2.3 


3.2 


PBMC PWM 


9.5 


6.9 


4.5 


1.7 


1.6 


x i_> j.vjly_x jl rin, ij 


6.9 


8.0 


8.7 


5.0 


3.4 


jxdjiiuo yxt ten/ u\ju<s 


7.7 


4.2 


4.7 


0.6 


1.4 


Pairing ceVH ionomvcin 


36.6 


32.1 


11.9 _^ 


9.2 


6.0 


x> iyiiipiiuwy lcj> jl VTivx 


11 7 

XX* i 


4.9 


6.7 


4.4 


4.3 


and IL-4 


34.2 


21.0 


13.2 


15.2 


19.8 


EOL-1 dbcAMP 


52.1 


34.4 


11.0 


10.8 


15.6 


EOL-1 dbcAMP 
PMMonomycin 


9.8 


6.0 


3.5 


1.4 


5.8 


Dendritic cells none 


9.5 


7.7 


7.3 


5.3 


j.4 


Dendritic cells LPS 


5.6 


5.0 


6.6 


1.1 


2.0 


Dendritic cells anti-CD40 


3.6 


4.2 


7.0 


1.3 


1 K 
1.3 


Monocytes rest 


4.9 


3.1 


6.9 


1.2 


U-U 


Monocytes LPS 


11.3 


8.4 


6.8 


2.9 


U.U 


Macrophages rest 


5.7 


10.2 


5.7 


1.9 


0.0 


Macrophages LPS 


3.2 


3.0 


5.2 


0.7 


3.6 


HUVEC none 


6.0 


4.2 


1.8 


1.3 


5.2 


HT TVEC starved 


11.0 


9.5 


4.4 


5.9 


2.3 


HUVEC DL-lbeta 


11.9 


10.1 


4.9 


8.1 


9.0 


HUvbC irJN gamma 


9.2 


9.4 


5.5 


2.7 


6,5 


HUVEC TNF alpha + IFN 
gamma 


3.8 


J-O 


A 1 

4.1 


1 ^ 
J.J 


1 5? 


HUVEC TNF alpha + IL4 


2.7 


2.8 


5.5 


0.0 


0.0 


T TT Tt TT"I TT *1 "1 

HUVEC IL-11 


4.3 


53 


3.5 


3.4 


0.0 


Lung Microvascular EC 
none 


25.3 


23.3 


7.5 


6.9 


< 

o.z 


Lung Microvascular EC 
TNFalpha + IL-lbeta 


9.2 


7.0 


7.9 


2.6 


2.2 


Microvascular Dermal EC 
none 


1.8 


2.1 


3.8 


0.0 


0.0 


Microsvasular Dermal EC 
TNFalpha + IL-lbeta 


2.0 


2.6 


1.9 


1.3 


0.0 
Bronchial epithelium
TNFalpha + IL-1beta


8.8 


14.0 


~t f*Grf 

10.6 


3.3 


3.3 


Small airway epithelium 
none 


10.7 


3.0 


2.4 


3.4 


6.0 


Small airway epithelium 
TNFalpha + IL-lbeta 


31.9 


31.0 


21.9 


30.4 


15.8 


Coronery artery SMC rest 


25.2 


19.6 


9.1 _ 


13.3 


13.4 


Coronery artery SMC 
TNFalpha + IL-lbeta 


27.5 


19.6 


5.5 


7.8 


15.2 


Astrocytes rest 


8.2 


15.3 


2.4 


1.9 


2.8 


Astrocytes TNFalpha + 
IL-lbeta 


5.2 


2.7 


3.4 


0.0 


5.3 


KU-812 (Basophil) rest 


10.7 


8.1 


3.5 


2.0 


0 0 


KU-812 (Basophil) 
PMA/ionomycin 


37.1 


25.5 


11.6 


8.9 


5.2 


CCD1106 (Keratinocytes) 
none 


on c 


OA ft 


too 

13.2 


4.5 


6.9 


CCDL106 (Keratinocytes) 
TNFalpha + IL-lbeta 


14.1 


22.7 


17 8 


77 




Liver cirrhosis 


11.4 


8.5 


7.4 


1.4 


1.4 


NCI-H292 none 


12.9 


7.6 


7.1 


5.5 


7.5 


NCI-H292EL-4 


11.9 


12.2 


4.3 


4.8 


5.8 


NCI-H292IL-9 


16.8 


12.7 


7.0 


3.7 


11.4 


NCI-H292IL-13 


12.5 


10.0 


6.5 


4.2 


7.3 


NCI-H292 DFN gamma 


3.9 


4.1 


7.6 


2.6 


4.2 


HPAEC none 


1.7 


2.9 


2.6 


0.0 


0.0 


HPAECTNF alpha + IL-1 
beta 


1U.O 


7.2 


2.9 


2.7 


3.3 


Lung fibroblast none 


31.2 


24.1 


4.5 


8.7 


5.8 


Lung fibroblast TNF 
alpha +IL-1 beta 


24.3 


21.6 


6.6 


7.5 


11.2 


Lung fibroblast IL-4 


6.5 


1.1 


1.8 




4.0 


Lung fibroblast IL-9 


19.2 


28.3 


8.2 


67 


7 7 


Lung fibroblast EL-13 


8.2 


5.1 


2.9 


0.0 


3.6 


gamma 


15.3 


14.9 


5.5 


3.8 


12.9 


jL/erniai iiorouiasi 
CCD1070rest 


25.0 


23.3 


7.8 


4.6 


11.0 


Dermal fibroblast 
CCD1070 TNF alpha 


74.2 


♦5.1 


14.1 


23.2 


36.3 


Dermal fibroblast 
CCD1070 EL-1 beta 


23.3 


22.4 


\3 


3.9 


5.7 


Dermal fibroblast IFN 
gamma 


3.4 


3.9 : 


2.0 


19 


).0 


Dermal fibroblast IL-4 ( 


5.8 


5.2 : 


*-3 


1.6 


3.0 


Dermal Fibroblasts rest 


11.2 


7.8 : 


2.8 


1.1 


3.8 


Neutrophils TNFa+LPS < 


1.5 1 


1.6 ] 


1.6 


1.8 


).0 



2 
Neutrophils rest


28.9 


31.2 


121 PCT 




3 


Colon 


2.3 _j 


1.5 


2.3 


0.0 


_]2.3 




Lung 


2.0 


2.4 


3.7 


0.9 


H.6 




Thymus 


13.0 


14.6 


6.6 


0.0 


j5.1 




Kidney 


7.9 


7.5 


1.7 


1.1 


]2.8 





Table ATH. general oncology screening panel_v_2.4

Tissue Name                 Rel. Exp.(%) Ag5110,   Tissue Name                 Rel. Exp.(%) Ag5110,
                            Run 259939210                                      Run 259939210
Colon cancer 1              [illegible]            Bladder NAT 2               0.0
Colon NAT 1                 [illegible]            Bladder NAT 3               0.0
Colon cancer 2              6.0                    Bladder NAT 4               0.0
Colon NAT 2                 14.2                   Prostate adenocarcinoma 1
Colon cancer 3              23.7                   Prostate adenocarcinoma 2   0.0
Colon NAT 3                 [illegible]            Prostate adenocarcinoma 3   [illegible]
Colon malignant cancer 4    41.5                   Prostate adenocarcinoma 4   14.2
Colon NAT 4                 4.2                    Prostate NAT 5              0.9
Lung cancer 1               7.5                    Prostate adenocarcinoma 6   0.0
Lung NAT 1                  0.0                    Prostate adenocarcinoma 7   0.7
Lung cancer 2               28.5                   Prostate adenocarcinoma 8   0.0
Lung NAT 2                  1.2                    Prostate adenocarcinoma 9   3.0
Squamous cell carcinoma 3   42.3                   Prostate NAT 10             0.0
Lung NAT 3                  0.0                    Kidney cancer 1             34.2
Metastatic melanoma 1       1.4                    Kidney NAT 1                4.5
Melanoma 2                  10.4                   Kidney cancer 2             100.0
Melanoma 3                  2.1                    Kidney NAT 2                3.2
Metastatic melanoma 4       2.2                    Kidney cancer 3             19.6
Metastatic melanoma 5       4.5                    Kidney NAT 3                1.1
Bladder cancer 1            0.0                    Kidney cancer 4             37.1
Bladder NAT 1               0.0                    Kidney NAT 4                1.0
Bladder cancer 2            2.3







CNS_neurodegeneration_v1.0 Summary: Ag1778/Ag5110 This panel confirms 
the expression of this gene at low levels in the brains of an independent group of 
individuals. However, no differential expression of this gene was detected between 
Alzheimer's diseased postmortem brains and those of non-demented controls in this 
experiment. Please see Panel 1.5 for a discussion of this gene in central 
nervous system disorders. 

General_screening_panel_v1.5 Summary: Ag5110 Highest expression of this 
gene is detected in fetal liver (CT=29.4). Interestingly, this gene is expressed at much 
higher levels in fetal liver than in adult liver (CT=37). This observation suggests that 
expression of this gene can be used to distinguish fetal from adult liver. In addition, the 
relative overexpression of this gene in fetal tissue suggests that the protein product may 
enhance liver growth or development in the fetus and thus may also act in a regenerative 
capacity in the adult. Therefore, therapeutic modulation of the protein encoded by this gene 
could be useful in treatment of liver related diseases. 
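
The CT values quoted above can be converted into an approximate fold difference. The following is a minimal sketch, assuming the common simplification that each PCR cycle doubles the amplicon (an amplification efficiency of 2), which the text does not state; the function name `fold_difference` is illustrative only.

```python
def fold_difference(ct_low: float, ct_high: float, efficiency: float = 2.0) -> float:
    """Approximate fold excess of template in the low-CT sample over the high-CT sample.

    A lower CT means the signal crossed threshold earlier, i.e. more starting template.
    `efficiency` is the assumed per-cycle amplification factor (2.0 = perfect doubling).
    """
    return efficiency ** (ct_high - ct_low)


# CT values as quoted in the General_screening_panel_v1.5 summary.
fetal_liver_ct = 29.4
adult_liver_ct = 37.0

fold = fold_difference(fetal_liver_ct, adult_liver_ct)
print(f"Fetal liver vs adult liver: about {fold:.0f}-fold")
```

Under this assumption, a difference of 7.6 cycles corresponds to a difference of roughly two orders of magnitude; a real assay with lower efficiency would give a smaller figure.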

Among tissues with metabolic or endocrine function, this gene is expressed at low 
levels in adipose, adrenal gland, heart, fetal liver and stomach. This gene codes for a splice 
variant of pyruvate dehydrogenase [lipoamide] kinase (PDK). PDK catalyzes 
phosphorylation and inactivation of the pyruvate dehydrogenase 
complex (PDC). Inactivation of PDC by increased PDK activity promotes gluconeogenesis 
by conserving three-carbon substrates. This helps maintain glucose levels during starvation, 
but is detrimental in diabetes (Huang et al., 2002, Diabetes 51(2):276-83, PMID: 
11812733). Therefore, therapeutic modulation of the activity of the PDK encoded by this gene may 
be useful in the treatment of endocrine/metabolically related diseases, such as obesity and 
diabetes. 

In addition, this gene is expressed at low levels in cerebellum and whole brain. 
Therefore, therapeutic modulation of this gene product may be useful in the treatment of 
neurological disorders such as Alzheimer's disease, Parkinson's disease, epilepsy, multiple 
sclerosis, schizophrenia and depression. 

Moderate to low levels of expression of this gene are also seen in a cluster of cancer 
cell lines derived from pancreatic, gastric, colon, lung, liver, renal, breast, ovarian, prostate, 
squamous cell carcinoma, melanoma and brain cancers. Thus, expression of this gene could 
be used as a marker to detect the presence of these cancers. Furthermore, therapeutic 
modulation of the expression or function of this gene may be effective in the treatment of 
pancreatic, gastric, colon, lung, liver, renal, breast, ovarian, prostate, squamous cell 
carcinoma, melanoma and brain cancers. 

General_screening_panel_v1.6 Summary: Ag1778/Ag5110 Two experiments 
with different probe and primer sets are in good agreement. Highest expression of this gene 
is detected in a prostate cancer PC3 cell line and a brain cancer U-118-MG cell line 
(CTs=25-29.8). Expression in this panel correlates with the pattern seen in panel 1.5. Moderate 
to low levels of expression of this gene are detected in tissues with metabolic/endocrine 
functions such as pancreas, adipose, adrenal gland, heart, fetal liver and gastrointestinal 
tract, in brain including cerebellum, cerebral cortex, substantia nigra and the whole brain 
and also in a number of cancer cell lines derived from pancreatic, gastric, colon, lung, liver, 
renal, breast, ovarian, prostate, squamous cell carcinoma, melanoma and brain cancers. 
Please see panel 1.5 for further discussion on the utility of this gene. 

Panel 1.3D Summary: Ag1778 Highest expression of this gene is detected in a 
breast cancer cell line (CT=27.4). Expression in this panel correlates with the pattern seen in 
panel 1.5. Moderate to low levels of expression of this gene are detected in tissues with 
metabolic/endocrine functions such as pancreas, adrenal gland, heart, fetal liver and 
gastrointestinal tract, in brain including cerebellum, cerebral cortex, substantia nigra and 
the whole brain and also in a number of cancer cell lines derived from pancreatic, gastric, 
colon, lung, liver, renal, breast, ovarian, prostate, squamous cell carcinoma, melanoma and 
brain cancers. Please see panel 1.5 for further discussion of this gene. 

Panel 4.1D Summary: Ag1778/Ag5110 Five experiments with the two different 
probe-primer sets are in good agreement. Highest expression of this gene is detected in 
PMA/ionomycin treated LAK cells. These cells are involved in tumor immunology and cell 
clearance of virally and bacterially infected cells as well as tumors. Therefore, modulation of 
the function of the protein encoded by this gene through the application of a small molecule 
drug or antibody may alter the functions of these cells and lead to improvement of 
symptoms associated with these conditions. 

Low levels of expression of this gene are also seen in naive and memory T cells, 
resting secondary CD8 lymphocytes, cytokine activated small airway epithelium, and 
resting neutrophils. Therefore, therapeutic modulation of this gene or its protein product 
may ameliorate symptoms/conditions associated with autoimmune and inflammatory 
disorders including psoriasis, allergy, asthma, inflammatory bowel disease, rheumatoid 
arthritis and osteoarthritis. 

general oncology screening panel_v_2.4 Summary: Ag5110 Highest expression 
of this gene is detected in kidney cancer (CT=32). Low levels of expression of this gene are 
also seen in colon, lung, prostate and kidney cancer. Higher levels of expression of this 
gene are associated with cancer as compared to corresponding normal adjacent tissue. Therefore, 
expression of this gene may be used as a diagnostic marker for the detection of these cancers. 
Furthermore, therapeutic modulation of this gene or its protein product may be useful in the 
treatment of colon, lung, prostate and kidney cancers. 
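
As an illustration of the comparison between cancer and normal adjacent tissue (NAT) described above, the sketch below takes the kidney values listed in Table ATH and computes a tumor/NAT ratio of relative expression for each numbered pair. It assumes that a cancer sample and the NAT sample with the same number form a matched pair, which the table layout suggests but does not state; the variable and function names are hypothetical.

```python
# Relative expression (%) for kidney samples, copied from Table ATH (Ag5110).
# Key: sample number; value: (kidney cancer, kidney NAT).
KIDNEY_PAIRS = {
    1: (34.2, 4.5),
    2: (100.0, 3.2),
    3: (19.6, 1.1),
    4: (37.1, 1.0),
}


def tumor_to_nat_ratios(pairs):
    """Return {sample number: tumor/NAT ratio}, skipping pairs whose NAT value is 0."""
    return {n: tumor / nat for n, (tumor, nat) in pairs.items() if nat > 0}


ratios = tumor_to_nat_ratios(KIDNEY_PAIRS)
for n, ratio in sorted(ratios.items()):
    print(f"Kidney pair {n}: tumor/NAT = {ratio:.1f}")
```

Because the table reports expression relative to the highest sample on the panel, these ratios compare samples within one run only and are not absolute transcript levels.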

AU. CG96736-01: Neutral amino acid transporter B. 

Expression of gene CG96736-01 was assessed using the primer-probe sets Ag3788 
and Ag4075, described in Tables AUA and AUB. Results of the RTQ-PCR runs are shown 
in Tables AUC, AUD, AUE, AUF, AUG, AUH, AUI, AUJ and AUK. 

Table AUA. Probe Name Ag3788 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-cgagaaatatcttcccttccaa-3' | 22 | 1182 | 432 
Probe | TET-5'-tgtcagcagcctttcgctcatactct-3'-TAMRA | 26 | 1209 | 433 
Reverse | 5'-ttccggtgatattcctctcttc-3' | 22 | 1244 | 434 



Table AUB. Probe Name Ag4075 

Primers | Sequences | Length | Start Position | SEQ ID No 
Forward | 5'-cgagaaatatcttcccttccaa-3' | 22 | 1182 | 435 
Probe | TET-5'-tgtcagcagcctttcgctcatactct-3'-TAMRA | 26 | 1209 | 436 
Reverse | 5'-ttccggtgatattcctctcttc-3' | 22 | 1244 | 437 
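
The primer and probe entries in Tables AUA and AUB can be sanity-checked mechanically. The sketch below is an illustration only: it confirms that each listed length matches its sequence and that the three oligonucleotides sit in order without overlap, assuming that "Start Position" is the 5'-most base of each oligonucleotide on a single coordinate axis. The tables do not state that convention (in particular for the reverse primer), and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Oligo:
    name: str
    sequence: str       # sequence as printed in the table, without 5'/3' labels or dyes
    listed_length: int  # "Length" column
    start: int          # "Start Position" column


# Values as listed for probe set Ag4075 (Table AUB); Ag3788 lists the same sequences.
AG4075 = [
    Oligo("Forward", "cgagaaatatcttcccttccaa", 22, 1182),
    Oligo("Probe", "tgtcagcagcctttcgctcatactct", 26, 1209),
    Oligo("Reverse", "ttccggtgatattcctctcttc", 22, 1244),
]


def length_is_consistent(oligo: Oligo) -> bool:
    """True when the printed sequence has exactly the listed number of bases."""
    return len(oligo.sequence) == oligo.listed_length


def end_position(oligo: Oligo) -> int:
    """Last covered base, under the assumed 'start = first base' convention."""
    return oligo.start + oligo.listed_length - 1


def in_order_without_overlap(oligos: list) -> bool:
    """True when each oligo ends before the next one starts."""
    return all(end_position(a) < b.start for a, b in zip(oligos, oligos[1:]))


for oligo in AG4075:
    print(oligo.name, length_is_consistent(oligo), oligo.start, end_position(oligo))
print("ordered, non-overlapping:", in_order_without_overlap(AG4075))
```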



Table AUC. AI_comprehensive panel_v1.0 



Tissue Name 


Rel. 

Exp.(%) 
Ag4075, 
Run 

226203371 


Tissue Name 


Rel. 

Exp.(%) 
Ag4075, 
Run 

226203371 


110967 COPD-F 


6.0 


112427 Match Control Psoriasis-F 


12.3 


110980 COPD-F 


9.9 


112418 Psoriasis-M 


3.6 


110968 COPD-M 


6.6 


112723 Match Control Psoriasis-M 


6.3 


110977 COPD-M 


0.0 


112419 Psoriasis-M 


6.5 


110989 Emphysema-F 


8.7 


112424 Match Control Psoriasis-M 




110992 Emphysema-F 


12.3 


112420 Psoriasis-M 


14.1 


110993 Emphysema-F 


7.2 


112425 Match Control Psoriasis-M 


6.7 


110994 Emphysema-F 


4.6 


104689 (MF) OA Bone-Backus 


21.6 


110995 Emphysema-F 


20.3 


104690 (MF) Adj "Normal" 
Bone-Backus 


21.8 


110996 Emphysema-F 


7.1 


104691 (MF) OA Synovium-Backus 


14.1 


110997 Asthma-M 


2.5 


104692 (BA) OA Cartilage-Backus 


53.6 


111001 Asthma-F 


6.7 


104694 (BA) OA Bone-Backus 


14.8 


111002 Asthma-F 


5.7 


104695 (BA) Adj "Normal" 
Bone-Backus 


28.7 


111003 Atopic Asthma-F 


11.0 


104696 (BA) OA Synovium-Backus 


15.8 


111004 Atopic Asthma-F 


13.3 


104700 (SS) OA Bone-Backus 


11.6 


111005 Atopic Asthma-F 


19.9 


104701 (SS) Adj "Normal" 
Bone-Backus 


ii. 1 


111006 Atopic Asthma-F 


2.6 


104702 (SS) OA Synovium-Backus 


27.5 


111417 Allergy-M 


7.6 


117093 OA Cartilage Rep7 


6.3 


112347 Allergy-M 


0.0 


112672 OA Bone5 


6.0 


112349 Normal Lung-F 


0.0 


112673 OA Synovium5 


1.4 


112357 Normal Lung-F 


19.9 


112674 OA Synovial Fluid cells5 


3.0 


112354 Normal Lung-M 


4.0 


117100 OA Cartilage Rep14 


4.0 


112374 Crohns-F 


2.7 


112756 OA Bone9 


100.0 


112389 Match Control Crohns-F 


9.3 


112757 OA Synovium9 


0.9 


112375 Crohns-F 


2.0 


112758 OA Synovial Fluid Cells9 


3.8 


112732 Match Control Crohns-F 


12.6 


117125 RA Cartilage Rep2 


9.0 


112725 Crohns-M 


0.3 


113492 Bone2RA 


8.1 


112387 Match Control 
Crohns-M 


5.0 


113493 Synovium2RA 


2.5 


112378 Crohns-M 


0.0 


113494 Syn Fluid Cells RA 


5.3 


112390 Match Control 
Crohns-M 


6.0 


113499 Cartilage4RA 


6.7 


112726 Crohns-M 


9.9 


113500 Bone4RA 


7.0 


112731 Match Control 
Crohns-M 


8.1 


113501 Synovium4RA 


4.4 


112380 Ulcer Col-F 


6.0 


113502 Syn Fluid Cells4 RA 


3.2 


112734 Match Control Ulcer 
Col-F 


21.0 


113495 Cartilage3RA 


6.3 


112384 Ulcer Col-F 


14.1 


113496 Bone3RA 


8.4 


112737 Match Control Ulcer 
Col-F 


3.4 


113497 Synovium3RA 


5.1 


112386 Ulcer Col-F 


3.4 


113498 Syn Fluid Cells3RA 


7.9 


112738 Match Control Ulcer 
Col-F 


18.0 


117106 Normal Cartilage Rep20 


5.7 


112381 Ulcer Col-M 


3.0 


113663 Bone3 Normal 


0.0 


112735 Match Control Ulcer 
Col-M 1 


).5 


113664 Synovium3 Normal 


0.0 


112382 Ulcer Col-M 


7.1 


113665 Syn Fluid Cells3 Normal 




112394 Match Control Ulcer 
Col-M 


1.6 


117107 Normal Cartilage Rep22 


1.8 


112383 Ulcer Col-M 


13.1 


113667 Bone4 Normal 


2.4 


112736 Match Control Ulcer 
Col-M 


3.8 


113668 Synovium4 Normal 


1.7 


112423 Psoriasis-F 


6.3 


113669 Syn Fluid Cells4 Normal 


3.9 



Table AUD. CNS_neurodegeneration_v1.0 



Tissue Name 


Rel. 

Exp.(%) 
Ag4075, 
Run 

214294982 


Tissue Name 


Rel. 

Exp.(%) 
Ag4075, 
Run 

214294982 


AD 1 Hippo 


11.0 


Control (Path) 3 Temporal Ctx 


1.0 


AD 2 Hippo 


8.4 


Control (Path) 4 Temporal Ctx 


1.7 


AD 3 Hippo 


8.0 


AD 1 Occipital Ctx 


6.5 


AD 4 Hippo 


2.9 


AD 2 Occipital Ctx (Missing) 


0.0 


AD 5 Hippo 


16.8 


AD 3 Occipital Ctx 


1.3 


AD 6 Hippo 


100.0 


AD 4 Occipital Ctx 


3.6 


Control 2 Hippo 


19.6 


AD 5 Occipital Ctx 


11.9 


Control 4 Hippo 


17.6 


AD 6 Occipital Ctx 


6.5 


Control (Path) 3 Hippo 


3.0 


Control 1 Occipital Ctx 


5.6 


AD 1 Temporal Ctx 


6.3 


Control 2 Occipital Ctx 


10.4 


AD 2 Temporal Ctx 


14.1 


Control 3 Occipital Ctx 


6.0 


AD 3 Temporal Ctx 


4.2 


Control 4 Occipital Ctx 


2.9 


AD 4 Temporal Ctx 


7.5 


Control (Path) 1 Occipital Ctx 


3.3 


AD 5 M Temporal Ctx 


8.9 


Control (Path) 2 Occipital Ctx 


0.5 


AD 5 Sup Temporal Ctx 


24.5 


Control (Path) 3 Occipital Ctx 


1.6 


AD 6 Inf Temporal Ctx 


78.5 


Control (Path) 4 Occipital Ctx 


0.4 


AD 6 Sup Temporal Ctx 


56.6 


Control 1 Parietal Ctx 


5.9 


Control 1 Temporal Ctx 


2.3 


Control 2 Parietal Ctx 


9.9 


Control 2 Temporal Ctx 


12.1 


Control 3 Parietal Ctx 


6.0 


Control 3 Temporal Ctx 


7.7 


Control (Path) 1 Parietal Ctx 


3.6 


Control 4 Temporal Ctx 


3.1 


Control (Path) 2 Parietal Ctx 


1.1 


Control (Path) 1 Temporal Ctx 


4.6 


Control (Path) 3 Parietal Ctx 


2.2 


Control (Path) 2 Temporal Ctx 


1.8 


Control (Path) 4 Parietal Ctx 


3.4 
Table AUE. General_screening_panel_v1.4 



Tissue Name 


Rel. 

Exp.(%) 
Ag4075, 
Run 

212696066 


Rel. 

Exp.(%) 
Ag4075, 
Run 

218525356 


Tissue Name 


Rel. 

Exp.(%) 
Ag4075, 
Run 

212696066 


Rel. 

Exp.(%) 
Ag4075, 
Run 

218525356 


Adipose 


0.0 


1.3 


Renal ca TK-10 


97 


14 R 


Melanoma* 
Hs688(A).T 


14.4 


23.2 


Bladder 


1.0 


1.8 


Melanoma* 
Hs688(B).T 


19.1 


29.9 


Gastric ca. (liver 
met.) NCI-N87 


41.5 


42.0 


Melanoma* M14 


9.5 


12.7 


Gastric ca. KATO III 


25.5 


22.8 


Melanoma* 
LOXIMVI 


O.l 


12.9 


Colon ca. SW-948 


4.4 


5.6 


Melanoma* 
SK-MEL-5 


5.9 


14.2 


Colon ca. SW480 


100.0 


100.0 


Squamous cell 
carcinoma SCC-4 


5.1 


102 


Colon ca.* (SW480 
met) SW620 


ill ^ 

*f X.J 


jU.u 


Testis Pool 


1.4 


1.9 


Colon ca. HT29 


10.2 


13.6 


Prostate ca.* (bone 
met)PC-3 


9.5 


13.6 


Colon ca. HCT-116 


13.0 


20.9 


Prostate Pool 


1.1 


1.5 


Colon ca. CaCo-2 


12.0 


14.5 


Placenta 


1.1 


1.3 


Colon cancer tissue 


5.0 


8.4 


Uterus Pool 


0.1 


0.2 


Colon ca. SW1116 


14.7 


15.9 


Ovarian ca. 
OVCAR-3 


6.5 


8.0 


Colon ca. Colo-205 


24.7 


29.5 


Ovarian ca. SK-OV-3 


8.1 


9.9 


Colon ca. SW-48 


3.6 


4.7 


Ovarian ca. 
OVCAR-4 


9.2 


16.4 


Colon Pool 


0.7 


1.1 


Ovarian ca. 
OVCAR-5 


28.1 


32.1 


Small Intestine Pool 


0.5 


0.6 


Ovarian ca. 
IGROV-1 


23.0 


33.2 


Stomach Pool 


0.8 


0.8 


Ovarian ca. 
OVCAR-8 




1 £. A 

16.4 


Bone Marrow Pool 


0.2 


0.4 


Ovary 


0.5 


0.8 


Fetal Heart 


0.1 


0.1 


Breast ca. MCF-7 


15.7 


17.2 


Heart Pool 


0.2 


3.3 


Breast ca. 
MDA-MB-231 


10.4 


15.6 


Lymph Node Pool 


1.2 


1.0 


Breast ca. BT 549 


?.9 


18.7 


Fetal Skeletal 
Muscle 


).2 


3.2 


Breast ca. T47D 


>3.2 


51.8 


Skeletal Muscle 
Pool 


0.2 


0.3 


Breast ca. MDA-N 


k7 


5.3 


Spleen Pool 


0.7 


0.5 


Breast Pool 


0.6 


0.6 


Thymus Pool 


0.8 


0.9 


Trachea 


3.6 


5.3 


CNS cancer 
(glio/astro) 
U87-MG 


20.0 


20.3 


Lung 


0.1 


0.1 


CNS cancer 
(glio/astro) 
U-118-MG 


11.2 


12.9 


Fetal Lung 


2.4 


4.0 


CNS cancer 
(neuro;met) 
SK-N-AS 




o r\ 

8.9 


Lung ca.NCI-N417 


1.6 


0.0 


CNS cancer (astro) 
SF-539 


9.3 


12.0 


Lung ca. LX-1 


81.8 


82.4 


CNS cancer (astro) 
SNB-75 


36.1 


55.5 


Lung ca. NCI-H146 


0.4 


0.8 


CNS cancer (glio) 
SNB-19 


30.1 


37.6 


Lung ca. SHP-77 


6.8 


8.5 


CNS cancer (glio) 
SF-295 


58.6 


60.7 


Lung ca. A549 


9.8 


15.8 


Brain (Amygdala) 
Pool 


0.0 


0.1 


Lung ca. NCI-H526 


2.1 


2.5 


Brain (cerebellum) 


0.1 


0.2 




4 ^ 




Brain (fetal) 


0.2 


0.3 








Brain 

(Hippocampus) 
Pool 


0.1 


0.1 


Lung ca. HOP-62 


4.4 


4.5 


Cerebral Cortex 
Pool 


0.0 




0.1 






inn 


Brain (Substantia 
nigra) Pool 


0.1 


0.1 


Liver 


0.0 


0.1 


Brain (Thalamus) 
Pool 


0.0 




Fetal Liver 


2.9 


4.3 


Brain (whole) 


0.2 


5.2 


Liver ca. HepG2 


5.7 


7.9 


Spinal Cord Pool 


0.2 


0.3 


Kidney Pool 


1.1 


1.2 


Adrenal Gland 


0.3 


0.6 


Fetal Kidney 


3.3 


0.5 


Pituitary gland Pool 


0.1 


0.3 


Renal ca. 786-0 


5.1 


)5 


Salivary gland 


j.o : 


IS j 


Renal ca.A498 


U ! 


5.0 


Thyroid (female) 


0.1 


0.1 


Renal ca. ACHN 




» ; 


Pancreatic ca. 
CAPAN2 


r.9 i 


12.2 


Renal ca. UO-31 


J.6 4 


k2 I 


Pancreas Pool 


.3 ] 


.2 
Table AUF. General_screening_panel_v1.5 



Tissue Name 


Rel. 

Exp.(%) 
Ag4075, 
Run 

228714883 


Tissue Name 


Rel. 

Exp.(%) 
Ag4075, 
Run 

228714883 


Adipose 


1.0 


Renal ca. TK-10 


9.8 


Melanoma* Hs688(A).T 


18.0 


Bladder 


1.4 


Melanoma* Hs688(B).T 


17.4 


Gastric ca. (liver met.) NCI-N87 


35.4 


Melanoma* M14 


9.5 


Gastric ca. KATO III 


19.9 


Melanoma* LOXIMVI 


9.0 


Colon ca. SW-948 


4.4 


Melanoma* SK-MEL-5 


8.7 


Colon ca. SW480 


100.0 


Squamous cell carcinoma SCC-4 


5.8 


Colon ca.* (SW480 met) SW620 


32.8 


Testis Pool 


1.2 


Colon ca. HT29 


9.9 


Prostate ca.* (bone met) PC-3 


10.8 


Colon ca. HCT-116 


15.2 


Prostate Pool 


1.5 


Colon ca. CaCo-2 


11.1 


Placenta 


1.1 


Colon cancer tissue 


5.1 


Uterus Pool 


0.3 


Colon ca. SW1116 


7.2 


Ovarian ca. OVCAR-3 


6.2 


Colon ca. Colo-205 


23.7 


Ovarian ca. SK-OV-3 


7.5 


Colon ca. SW-48 


3.2 


Ovarian ca.OVCAR-4 


12.5 


Colon Pool 


0.7 


Ovarian ca. OVCAR-5 


20.2 


Small Intestine Pool 


0.4 


Ovarian ca.IGROV-1 


23.8 


Stomach Pool 


0.7 


Ovarian ca. OVCAR-8 


11.2 


Bone Marrow Pool 


0.2 


Ovary 


0.6 


Fetal Heart 


0.1 


Breast ca. MCF-7 


14.4 


Heart Pool 


0.2 


Breast ca. MDA-MB-231 


14.1 


Lymph Node Pool 


0.7 


Breast ca. BT 549 


8.4 


Fetal Skeletal Muscle 


0.2 


Breast ca.T47D 


2.1 


Skeletal Muscle Pool 


0.4 


Breast ca. MDA-N 


3.6 


Spleen Pool 


0.3 


Breast Pool 


0.5 


Thymus Pool 


0.5 


Trachea 


4.6 


CNS cancer (glio/astro) U87-MG 


12.5 


Lung 


0.1 


CNS cancer (glio/astro) U-118-MG 


8.5 


Fetal Lung 


2.6 


CNS cancer (neuro;met) SK-N-AS 


5.5 


Lung ca. NCI-N417 


1.9 


CNS cancer (astro) SF-539 


8.4 


Lung ca.LX-1 


81.8 


CNS cancer (astro) SNB-75 


13.1 


Lungca.NCI-H146 


0.6 


CNS cancer (glio) SNB-19 


27.2 


Lung ca. SHP-77 


7.7 


CNS cancer (glio) SF-295 


53.2 


Lung ca. A549 


11.8 


Brain (Amygdala) Pool 


0.0 


Lungca. NCI-H526 


2.1 


Brain (cerebellum) 


0.1 


Lung ca. NCI-H23 


3.5 


Brain (fetal) 


0.2 


Lungca.NCI-H460 


8.8 


Brain (Hippocampus) Pool 


0.0 


Lung ca. HOP-62 


3.5 


Cerebral Cortex Pool 


0.1 
Lung ca. NCI-H522 



7.5 



Brain (Substantia nigra) Pool 



0.1 



Liver 



0.0 



Brain (Thalamus) Pool 



0.1 



Fetal Liver 



2.9 



Brain (whole) 



0.2 



Liver ca. HepG2 



6.2 



Spinal Cord Pool 



0.1 



Kidney Pool 



0.8 



Adrenal Gland 



0.4 



Fetal Kidney 



0.3 



Pituitary gland Pool 



0.2 



Renal ca. 786-0 



5.6 



Salivary Gland 



2.7 



Renal ca. A498 



3.4 



Thyroid (female) 



0.1 



Renal ca. ACHN 



4.9 



Pancreatic ca. CAPAN2 



9.7 



Renal ca. UO-31 



2.4 



Pancreas Pool 



0.8 



Table AUG. Panel 3D 






Tissue Name 


Rel. 

Exp.(%) 
Ag4075, 
Run 

186579982 


Tissue Name 


Rel. 

Exp.(%) 
Ag4075, 
Run 

186579982 


Daoy- Medulloblastoma 


1.7 


Ca Ski- Cervical epidermoid 
carcinoma (metastasis) 


9.3 


TE671- Medulloblastoma 


1.3 


Ovarian clear cell carcinoma 


4.2 


D283 Med- Medulloblastoma 


13.6 


Ramos- Stimulated with 
PMA/ionomycin 6h 


12.2 


PFSK-1- Primitive 
Neuroectodermal 


8.0 


Ramos- Stimulated with 
PMA/ionomycin 14h 


12.2 


XF-498-CNS 


5.1 


MEG-01- Chronic myelogenous 
leukemia (megokaryoblast) 


25.0 


SNB-78- Glioma 


12.9 


Raji- Burkitt's lymphoma 


2.4 


SF-268- Glioblastoma 


5.4 


Daudi- Burkitt's lymphoma 


5.0 


T98G- Glioblastoma 


7.9 


U266- B-cell plasmacytoma 


9.3 


SK-N-SH- Neuroblastoma 
(metastasis) 


4.4 


CA46- Burkitt's lymphoma 


2.6 


SF-295- Glioblastoma 


8.2 


RL- non-Hodgkin's B-cell 
lymphoma 


6.5 


Cerebellum 


0.1 


JM1- pre-B-cell lymphoma 


6.0 


Cerebellum 


0.1 


Jurkat- T cell leukemia 


7.6 


NCI-H292- Mucoepidermoid 
lung carcinoma 


12.0 


TF-1- Erythroleukemia 


17.6 


DMS-114- Small cell lung cancer 


3.0 


HUT 78- T-cell lymphoma 


4.9 


DMS-79- Small cell lung cancer 


92.0 


U937- Histiocytic lymphoma 


17.9 


NCI-H146- Small cell lung 
cancer 


1.6 


KU-812- Myelogenous leukemia 


15.4 


NCI-H526- Small cell lung 
cancer 


10.7 


769-P- Clear cell renal carcinoma 


5.8 


NCI-N417- Small cell lung 
cancer 


3.0 


Caki-2- Clear cell renal carcinoma 


5.5 


NCI-H82- Small cell lung cancer 


5.7 


SW 839- Clear cell renal carcinoma 


6.2 


NCI-H157- Squamous cell lung 
cancer (metastasis) 


30.1 


G401- Wilms' tumor 


3.8 


NCI-H1155- Large cell lung 
cancer 


9.5 


Hs766T- Pancreatic carcinoma (LN 
metastasis) 


7.6 


NCI-H1299- Large cell lung 
cancer 


6.1 


CAPAN-1- Pancreatic 
adenocarcinoma (liver metastasis) 


3.3 


NCI-H727- Lung carcinoid 


8.7 


SU86.86- Pancreatic carcinoma 
(liver metastasis) 


5.1 


NCI-UMC-11- Lung carcinoid 


14.4 


BxPC-3- Pancreatic 
adenocarcinoma 


11.4 


LX-1- Small cell lung cancer 


100.0 


HPAC- Pancreatic adenocarcinoma 


6.1 


Colo-205- Colon cancer 


49.3 


MIA PaCa-2- Pancreatic carcinoma 


1.1 


KM12- Colon cancer 


12.7 


CFPAC-1- Pancreatic ductal 
adenocarcinoma 


10.4 


KM20L2- Colon cancer 


11.7 


PANC-1- Pancreatic epithelioid 
ductal carcinoma 


4.3 


NCI-H716- Colon cancer 


10.2 


T24- Bladder carcinoma (transitional 
cell) 




SW-48- Colon adenocarcinoma 


6.7 


5637- Bladder carcinoma 


2.8 


SW1116- Colon adenocarcinoma 


20.9 


HT-1197- Bladder carcinoma 


10.4 


LS 174T- Colon adenocarcinoma 


13.4 


UM-UC-3- Bladder carcinoma 
(transitional cell) 


1.4 


SW-948- Colon adenocarcinoma 


0.9 


A204- Rhabdomyosarcoma 


2.6 


SW-480- Colon adenocarcinoma 


3.5 


HT-1080- Fibrosarcoma 


4.7 


NCI-SNU-5- Gastric carcinoma 


34.6 


MG-63- Osteosarcoma 


8.1 


KATO III- Gastric carcinoma 


38.7 


SK-LMS-1- Leiomyosarcoma 
(vulva) 


8.1 


NCI-SNU-16- Gastric carcinoma 


2.9 


SJRH30- Rhabdomyosarcoma (met 
to bone marrow) 


1.9 


NCI-SNU-1- Gastric carcinoma 


22.4 


A431- Epidermoid carcinoma 


10.6 


RF-1- Gastric adenocarcinoma 


1.8 


WM266-4- Melanoma 


5.5 


RF-48- Gastric adenocarcinoma 


1.9 


DU 145- Prostate carcinoma (brain 
metastasis) 


0.1 


MKN-45- Gastric carcinoma 


12.0 


MDA-MB-468- Breast 
adenocarcinoma 


13.4 


NCI-N87- Gastric carcinoma 


24.5 


SCC-4- Squamous cell carcinoma of 
tongue 


0.2 


OVCAR-5- Ovarian carcinoma 


2.3 


SCC-9- Squamous cell carcinoma of 
tongue 


0.2 


RL95-2- Uterine carcinoma 


8.3 


SCC-15- Squamous cell carcinoma 
of tongue 


0.3 


HelaS3- Cervical 
adenocarcinoma 


2.3 


CAL 27- Squamous cell carcinoma 
of tongue 


10.7 
Table AUH. Panel 4.1D 



Tissue Name 


Rel. 

Exp.(%) 
Ag4075, 
Run 

184565261 


Tissue Name 


Rel. 

Exp.(%) 
Ag4075, 
Run 

184565261 


Secondary Th1 act 


81.2 


HUVEC IL-1beta 


35.1 


Secondary Th2 act 


84.1 


HUVEC IFN gamma 


17.6 


Secondary Tr1 act 


67.8 


HUVEC TNF alpha + IFN gamma 


24.7 


Secondary Th1 rest 


3.5 


HUVEC TNF alpha + IL4 


29.9 


Secondary Th2 rest 


11.3 


HUVEC IL-11 


12.4 


Secondary Tr1 rest 


3.6 


Lung Microvascular EC none 


33.4 


Primary Th1 act 


43.2 


Lung Microvascular EC TNFalpha 
+ IL-1beta 


21.0 


Primary Th2 act 


55.1 


Microvascular Dermal EC none 


20.3 


Primary Tr1 act 


51.8 


Microvascular Dermal EC 
TNFalpha + IL-1 beta 


11.7 


Primary Th1 rest 


i ^ 
j.j 


Bronchial epithelium TNFalpha + 
IL-1beta 


J/.O 


Primary Th2 rest 


2.2 


Small airway epithelium none 


10.8 


Primary Tr1 rest 


10.3 


Small airway epithelium TNFalpha 
+ IL-1beta 


15.3 


CD45RA CD4 lymphocyte act 


52.5 


Coronary artery SMC rest 


34.6 


CD45R0 CD4 lymphocyte act 


45.7 


Coronary artery SMC TNFalpha + 
IL-1beta 


32.5 


CD8 lymphocyte act 


51.1 


Astrocytes rest 


10.9 


Secondary CD8 lymphocyte rest 


41.5 


Astrocytes TNFalpha + IL-1beta 


7.1 


Secondary CD8 lymphocyte act 


36.1 


KU-812 (Basophil) rest 


52.1 


CD4 lymphocyte none 


VJ.U 


KU-812 (Basophil) 
PMA/ionomycin 


82.4 


2ry Th1/Th2/Tr1_anti-CD95 
CH11 


4.0 


CCD1106 (Keratinocytes) none 


52.9 


LAK cells rest 


24.1 


CCD1106 (Keratinocytes) 
TNFalpha + IL-1beta 


39.8 


LAK cells IL-2 


34.6 


Liver cirrhosis 


2.8 


LAK cells IL-2+IL-12 


Zoo 


NCI-H292 none 


97 ft ! 


LAK cells IL-2+IFN gamma 


20.4 


NCI-H292 IL-4 


53.6 


LAK cells IL-2+IL-18 


29.5 


NCI-H292 IL-9 


29.5 


LAK cells PMA/ionomycin 


49.0 


NCI-H292 IL-13 


51.4 


NK Cells IL-2 rest 


43.2 


NCI-H292 IFN gamma 


58.6 


Two Way MLR 3 day 


22.4 


HPAEC none 


10.4 


Two Way MLR 5 day 


39.8 


HPAEC TNF alpha + IL-1 beta 


17.0 


Two Way MLR 7 day 


25.9 


Lung fibroblast none 


42.0 


PBMC rest 


23 


Lung fibroblast TNF alpha + IL-1 
beta 


17.7 
PBMC PWM 


42.3 


Lung fibroblast IL-4 




PBMC PHA-L 


30.1 


Lung fibroblast IL-9 


38.4 


Ramos (B cell) none 


57.4 


Lung fibroblast IL-13 


41.2 


Ramos (B cell) ionomycin 


100.0 


Lung fibroblast IFN gamma 


39.5 


B lymphocytes PWM 


31.2 


Dermal fibroblast CCD 1070 rest 


84.7 


B lymphocytes CD40L and IL-4 


14.5 


Dermal fibroblast CCD 1070 TNF 
alpha 


59.0 


EOL-1 dbcAMP 


IZ.1 1 
01.1 


Dermal fibroblast CCD 1 070 IL-1 
beta 


55.1 


EOL-1 dbcAMP 
PMA/ionomycin 


21.2 


Dermal fibroblast IFN gamma 


16.7 


Dendritic cells none 


28.5 


Dermal fibroblast IL-4 


36.9 


Dendritic cells LPS 


7.9 


Dermal Fibroblasts rest 


15.0 


Dendritic cells anti-CD40 


32.8 


Neutrophils TNFa+LPS 


1.6 


Monocytes rest 


11.0 


Neutrophils rest 


0.4 


Monocytes LPS 


5.4 


Colon 


4.5 


Macrophages rest 


25.5 


Lung 


7.5 


Macrophages LPS 


3.7 


Thymus 


6.3 


HUVEC none 


21.9 


Kidney 


12.9 


HUVEC starved 


27.7 







Table AUI. Panel 5 Islet 



Tissue Name 


Rel. 

Exp.(%) 

Ag4075, 

Run 

186511155 


Tissue Name 


Rel. 

Exp.(%) 
Ag4075, 
Run 

186511155 


97457_Patient-02go_adipose 


7.6 


94709_Donor 2 AM - A_adipose 


45.7 


97476_Patient-07sk_skeletal 
muscle 


2.9 


94710_Donor 2 AM - B_adipose 


27.4 


97477_Patient-07ut_uterus 


3.5 


94711_Donor 2 AM - C_adipose 


15.2 


97478_Patient-07pl_placenta 


5.0 


94712_Donor 2 AD - A_adipose 


62.9 


99167_Bayer Patient 1 


30.6 


94713_Donor 2 AD - B_adipose 


66.4 


97482_Patient-08ut_uterus 


4.6 


94714_Donor 2 AD - C_adipose 


57.4 


97483_Patient-08pl_placenta 


3.8 


94742_Donor 3 U - A_Mesenchymal 
Stem Cells 


36.1 


97486_Patient-09sk_skeletal 
muscle 


0.3 


94743_Donor 3 U - B_Mesenchymal 
Stem Cells 


62.4 


97487_Patient-09ut_uterus 


8.3 


94730_Donor 3 AM - A_adipose 


34.9 


97488_Patient-09pl_placenta 


3.4 


94731_Donor 3 AM - B_adipose 


17.2 


97492_Patient-10ut_uterus 


7.5 


94732_Donor 3 AM - C_adipose 


22.4 


97493_Patient-10pl_placenta 


5.1 


94733_Donor 3 AD - A_adipose 


100.0 


97495_Patient-11go_adipose 


6.4 


94734_Donor 3 AD - B_adipose 


32.3 
97496_Patient-11sk_skeletal 
muscle 


1.3 


94735_Donor 3 AD - C_adipose 


66.9 


97497_Patient-11ut_uterus 


11.6 


77138_Liver_HepG2untreated 


31.4 


97498_Patient-11pl_placenta 


3.9 


73556_Heart_Cardiac stromal cells 
(primary) 


3.6 


97500_Patient-12go_adipose 


8.5 


81735_Small Intestine 


6.4 


97501_Patient-12sk_skeletal 
muscle 


2.7 


72409_Kidney_Proximal Convoluted 
Tubule 


3.8 


97502_Patient-12ut_uterus 


8.7 


82685_Small intestine_Duodenum 


1.9 


97503_Patient-12pl_placenta 


3.1 


90650_Adrenal_Adrenocortical 
adenoma 


1.4 


94721_Donor 2 U - 
A_Mesenchymal Stem Cells 


40.1 


72410_Kidney_HRCE 


14.9 


94722_Donor 2 U - 
B_Mesenchymal Stem Cells 


23.7 


72411_Kidney_HRE 


11.1 


94723_Donor 2 U - 
C_Mesenchymal Stem Cells 


52.5 


73139_Uterus_Uterine smooth 
muscle cells 


17.4 



Table AUJ. Panel 5D 



Tissue Name 


Rel. 
Exp.(%) 
Ag3788, 
Run 
170222681 


Rel. 
Exp.(%) 
Ag4075, 
Run 
172167823 


Tissue Name 


Rel. 
Exp.(%) 
Ag3788, 
Run 
170222681 


Rel. 
Exp.(%) 
Ag4075, 
Run 
172167823 


97457_Patient-02go_adipose 


8.2 


11.0 


94709_Donor2AM- 
A_adipose 


44.1 


53.2 


97476_Patient-07sk_skeletal 
muscle 


2.1 


2.8 


94710_Donor2AM- 
B_adipose 


31.2 


28.3 


97477_Patient-07ut_uterus 


3.5 


7.1 


94711_Donor2AM- 
C_adipose 


29.3 


30.8 


97478_Patient-07pl_placenta 


5.1 


5.8 


94712_Donor2AD- 
A_adipose 


77.4 


81.8 


97481_Patient-08sk_skeletal 
muscle 


4.2 


3.9 


94713_Donor2AD- 
B_adipose 


100.0 


100.0 


97482_Patient-08ut_uterus 


5.7 


8.7 


94714_Donor2AD- 
C_adipose 


68.8 


84.1 


97483_Patient-08pl_placenta 


7.5 


7.2 


94742_Donor3U- 
A_Mesenchymal Stem Cells 


55.1 


66.9 


97486_Patient-09sk_skeletal 
muscle 


0.9 


1.2 


94743_Donor3U- 
B_Mesenchymal Stem Cells 


62.9 


70.7 


97487_Patient-09ut_uterus 


8.5 


11.0 


94730_Donor3AM- 
A_adipose 


41.5 


46.7 


97488_Patient-09pl_placenta 


4.9 


4.2 


94731_Donor3AM- 
B_adipose 


29.7 


29.5 
97492_Patient-10ut_uterus 


5.7 


5.8 


94732_Donor3AM- 
C_adipose 


25.7 


36.6 


97493_Patient-10pl_placenta 


6.3 


7.0 


94733_Donor3AD- 
A_adipose 


97.3 


92.7 


97495_Patient-11go_adipose 


7.3 


8.8 


94734_Donor3AD- 
B_adipose 


58.6 


80.7 


97496_Patient-11sk_skeletal 
muscle 


1.7 


1.3 


94735_Donor3AD- 
C_adipose 


69.3 


83.5 


97497_Patient-11ut_uterus 


12.9 


15.7 


77138_Liver_HepG2untreated 


72.7 


80.7 


97498_Patient-11pl_placenta 


4.7 


6.8 


73556_Heart_Cardiac stromal 
cells (primary) 


2.6 


4.7 


97500_Patient-12go_adipose 


9.5 


12.6 


81735_Small Intestine 


7.6 


8.9 


97501_Patient-12sk_skeletal 
muscle 


2.7 


2.4 


72409_Kidney_Proximal 
Convoluted Tubule 


4.6 


4.3 


97502_Patient-12ut_uterus 


9.3 


10.7 


82685_Small 
intestine_Duodenum 


1.9 


2.0 


97503_Patient-12pl_placenta 


3.0 


3.1 


90650_Adrenal_Adrenocortical 
adenoma 


1.4 


1.1 


94721_Donor2U- 
A_Mesenchymal Stem 
Cells 


50.3 


52.9 


72410_Kidney_HRCE 


21.9 


21.5 


94722_Donor2U- 
B_Mesenchymal Stem 
Cells 


45.4 


47.3 


72411_Kidney_HRE 


15.7 


0.0 


94723_Donor2U- 
C_Mesenchymal Stem 
Cells 


52.1 


45.4 


73139_Uterus_Uterine 
smooth muscle cells 


23.7 


28.3 



Table AUK. general oncology screening panel_v_2.4 



Tissue Name 


Rel. 

Exp.(%) 
Ag4075, 
Run 

259745203 


Tissue Name 


Rel. 

Exp.(%) 
Ag4075, 
Run 

259745203 


Colon cancer 1 


50.7 


Bladder cancer NAT 2 


0.1 


Colon cancer NAT 1 


13.5 


Bladder cancer NAT 3 


0.0 


Colon cancer 2 


47.0 


Bladder cancer NAT 4 


0.1 


Colon cancer NAT 2 


24.3 


Prostate adenocarcinoma 1 


33.9 


Colon cancer 3 


95.9 


Prostate adenocarcinoma 2 


3.6 


Colon cancer NAT 3 


16.2 


Prostate adenocarcinoma 3 


26.4 


Colon malignant cancer 4 


55.9 


Prostate adenocarcinoma 4 


100.0 


Colon normal adjacent tissue 4 


6.2 


Prostate cancer NAT 5 


6.8 


Lung cancer 1 


11.4 


Prostate adenocarcinoma 6 


11.2 
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Lung NAT 1 |0.6 iProstateain^kokT 0 ^' 


g qAT , n \ti. .,,41 .tf » 


Lung cancer 2 


12.9 {Prostate adenocarcinoma 8 


2.6 


Lung NAT 2 


1.0 


Prostate adenocarcinoma 9 


38.2 


Squamous cell carcinoma 3 


62.0 


Prostate cancer NAT 10 


0.6 


Lung NAT 3 


1.1 


Kidney cancer 1 


7.9 


metastatic melanoma 1 


20.2 


KidneyNAT 1 


2.9 


Melanoma 2 


3.1 


Kidney cancer 2 


28.1 


Melanoma 3 


1.7 


Kidney NAT 2 


8.5 


metastatic melanoma 4 


57.0 


Kidney cancer 3 


13.9 


metastatic melanoma 5 


25.3 


Kidney NAT 3 


21 


Bladder cancer 1 


0.2 


Kidney cancer 4 


9.6 


Bladder cancer NAT 1 


0.0 


Kidney NAT 4 


11.2 


Bladder cancer 2 


11.7 







AI_comprehensive panel_v1.0 Summary: Ag4075 Highest expression is seen in
an osteoarthritic bone sample (CT=27.31). This gene is expressed at moderate to low levels 
in many samples on this panel. Please see Panel 4.1 for discussion of this gene in 
inflammation. 

CNS_neurodegeneration_v1.0 Summary: Ag4075 This panel does not show
differential expression of this gene in Alzheimer's disease. However, this profile confirms 
the expression of this gene at moderate levels in the brain. Please see Panel 1.4 for 
discussion of this gene in the central nervous system. 

General_screening_panel_v1.4 Summary: Ag4075 Two experiments with the
same probe and primer set produce results that are in excellent agreement. Highest
expression is seen in a colon cancer cell line (CTs=21-22). Overall, expression of this gene
appears to be highly associated with cancer cell line samples, with high levels of
expression in brain, colon, gastric, lung, breast, ovarian, and melanoma cancer cell lines.
This expression profile suggests a role for this gene product in cell survival and 
proliferation. This gene encodes a protein with homology to Neutral amino acid transporter 
2. L type amino acid transporter 1 (LAT1) has been implicated in tumor growth and may 
play an important role in supplying nutrition to cells for cell proliferation (Ohkame, J Surg 
Oncol 2001 Dec;78(4):265-71; discussion 271-2). Thus, modulation of this gene product 
may be useful in the treatment of cancer. 

Among tissues with metabolic function, this gene is expressed at moderate levels in 
pituitary, adipose, adrenal gland, pancreas, thyroid, and adult and fetal skeletal muscle, 
heart, and liver. This widespread expression among these tissues suggests that this gene
product may play a role in normal neuroendocrine and metabolic function and that
dysregulated expression of this gene may contribute to neuroendocrine disorders or
metabolic diseases, such as obesity and diabetes. 

This gene is also expressed at moderate levels in the CNS, including the 
hippocampus, thalamus, substantia nigra, amygdala, cerebellum and cerebral cortex. 
Therefore, therapeutic modulation of the expression or function of this gene may be useful 
in the treatment of neurologic disorders, such as Alzheimer's disease, Parkinson's disease, 
schizophrenia, multiple sclerosis, stroke and epilepsy. 

In addition, this gene is expressed at much higher levels in fetal lung and liver tissue 
(CTs=26-27) when compared to expression in the adult counterparts (CTs=31-33). Thus, 
expression of this gene may be used to differentiate between the fetal and adult sources of 
these tissues. 
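
The CT gap between fetal and adult tissue can be read as an approximate fold difference in transcript level. The following is a minimal sketch, assuming ideal amplification in which the product doubles every cycle; neither that assumption nor the function name `fold_difference` comes from this document, and the CT midpoints are taken from the ranges quoted above.

```python
def fold_difference(ct_low: float, ct_high: float, efficiency: float = 2.0) -> float:
    """Approximate ratio of starting template between two samples.

    A lower CT means more starting template. `efficiency` is the assumed
    per-cycle amplification factor (2.0 = perfect doubling, an idealization).
    """
    if efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 1")
    return efficiency ** (ct_high - ct_low)


# Midpoints of the quoted ranges: fetal CTs=26-27, adult CTs=31-33.
fetal_ct = 26.5
adult_ct = 32.0
print(f"approximate fetal/adult ratio: {fold_difference(fetal_ct, adult_ct):.1f}")
```

Under these assumptions a difference of about 5.5 cycles corresponds to roughly a 45-fold higher transcript level in the fetal samples; real amplification efficiencies are usually somewhat below 2, so the true ratio would be smaller.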

General_screening_panel_v1.5 Summary: Ag4075 Highest expression is seen in
a colon cancer cell line (CT=20), with expression in this panel in strong agreement with 
Panel 1.4. Please see that panel for discussion of this gene in disease. 

Panel 3D Summary: Ag4075 Expression of this gene is widespread on this panel, 
with highest expression in a lung cancer cell line (CT=26). The widespread expression on 
this panel is in agreement with expression in Panels 1.4 and 1.5 where expression of this 
gene is highly associated with cancer cell line samples. Please see Panel 1.4 for discussion 
of this gene in oncology. 

Panel 4.1D Summary: Ag4075 Highest expression of this gene is seen in a sample 
derived from the Ramos B cell line treated with ionomycin (CT=27.3). In addition, this 
gene appears to be more highly expressed in activated T cells than in resting T cells. Thus, 
therapeutic regulation of the transcript or the protein encoded by the transcript could be 
important in immune modulation and in the treatment of T cell-mediated diseases such as 
asthma, arthritis, psoriasis, IBD, and lupus. In addition, this gene is also expressed at
moderate levels in a wide range of cell types of significance in the immune response in
health and disease. These cells include members of the T-cell, B-cell, endothelial cell,
macrophage/monocyte, and peripheral blood mononuclear cell family, as well as epithelial
and fibroblast cell types from lung and skin, and normal tissues represented by colon, lung,
thymus and kidney. This ubiquitous pattern of expression suggests that this gene product
may be involved in homeostatic processes for these and other cell types and tissues. This
pattern is in agreement with the expression profile in General_screening_panel_v1.4 and
also suggests a role for the gene product in cell survival and proliferation. Therefore,
modulation of the gene product with a functional therapeutic may lead to the alteration of
functions associated with these cell types and lead to improvement of the symptoms of 
patients suffering from autoimmune and inflammatory diseases such as asthma, allergies, 
inflammatory bowel disease, lupus erythematosus, psoriasis, rheumatoid arthritis, and 
osteoarthritis. 

Panel 5 Islet Summary: Ag4075 Highest expression is seen in adipose (CT=27).
In addition, expression of this gene is widespread on this panel, with moderate to high
levels in metabolic tissues, including skeletal muscle, adipose, pancreatic islet cells and
placenta. This gene codes for neutral amino acid transporter B(0) [ATB(0)]. ATB(0)
transports the gluconeogenic amino acids L-alanine and L-glutamine into cells. Excess
neutral amino acid transport and a resultant increase in gluconeogenesis and triglyceride
synthesis may impair beta cell function in obesity and Type 2 diabetes. Pharmacologic
inhibition of ATB(0) encoded by this gene may prevent or treat the symptoms of
obesity-related Type 2 diabetes.

Panel 5D Summary: Ag4075 Expression on this panel agrees with Panel 5I.
Highest expression is seen in adipose in two replicate experiments (CTs=28). Please see
Panels 5I and 1.4 for further discussion of utility of this gene in metabolic disease.

general oncology screening panel_v_2.4 Summary: Ag4075 Highest expression
of this gene is seen in prostate cancer (CT=27). Prominent expression is also seen in
melanoma and squamous cell carcinoma derived samples. In addition, this gene appears to
be overexpressed in colon, lung and prostate cancer when compared to expression in the
normal adjacent tissue. Thus, expression of this gene could be used as a marker to detect
the presence of colon, lung and prostate cancer. Furthermore, therapeutic modulation of the
expression or function of this gene may be effective in the treatment of colon, prostate,
melanoma and lung cancer.
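
The tumor versus normal adjacent tissue (NAT) comparison can be made concrete with the relative expression values listed in Table AUK. This is an illustrative sketch only: the pairing of each tumor sample with the NAT sample of the same number is an assumption, and the helper name `tumor_to_nat_ratio` is hypothetical.

```python
# Rel. Exp.(%) values copied from Table AUK (Ag4075, Run 259745203).
# Pairing tumor N with NAT N is an assumption made for illustration.
PAIRS = {
    "Colon 1": (50.7, 13.5),
    "Colon 2": (47.0, 24.3),
    "Colon 3": (95.9, 16.2),
    "Colon 4": (55.9, 6.2),
    "Lung 1": (11.4, 0.6),
    "Lung 2": (12.9, 1.0),
}


def tumor_to_nat_ratio(tumor: float, nat: float) -> float:
    """Ratio of tumor to NAT relative expression; infinite if NAT is zero."""
    if nat <= 0:
        return float("inf")
    return tumor / nat


for label, (tumor, nat) in PAIRS.items():
    print(f"{label}: tumor/NAT = {tumor_to_nat_ratio(tumor, nat):.1f}")
```

Every listed pair gives a ratio above 1, which is the pattern the summary describes as overexpression relative to normal adjacent tissue.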

Example D: Identification of Single Nucleotide Polymorphisms in NOVX
nucleic acid sequences 

Variant sequences are also included in this application. A variant sequence can
include a single nucleotide polymorphism (SNP). A SNP can, in some instances, be
referred to as a "cSNP" to denote that the nucleotide sequence containing the SNP
originates as a cDNA. A SNP can arise in several ways. For example, a SNP may be due to
a substitution of one nucleotide for another at the polymorphic site. Such a substitution can
be either a transition or a transversion. A SNP can also arise from a deletion of a
nucleotide or an insertion of a nucleotide, relative to a reference allele. In this case, the
polymorphic site is a site at which one allele bears a gap with respect to a particular
nucleotide in another allele. SNPs occurring within genes may result in an alteration of the
amino acid encoded by the gene at the position of the SNP. Intragenic SNPs may also be
silent, when a codon including a SNP encodes the same amino acid as a result of the
redundancy of the genetic code. SNPs occurring outside the region of a gene, or in an
intron within a gene, do not result in changes in any amino acid sequence of a protein but
may result in altered regulation of the expression pattern. Examples include alteration in
temporal expression, physiological response regulation, cell type expression regulation,
intensity of expression, and stability of transcribed message.
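
The distinctions drawn above (transition versus transversion, and silent versus amino-acid-changing substitutions) can be sketched in a few lines. This is an illustration under stated assumptions: it uses the standard genetic code, the function names are hypothetical, and the example codons are chosen for demonstration because the SNP tables in this document list amino acids but not the codons themselves.

```python
from itertools import product

# Standard genetic code, built in TCAG order; "*" marks a stop codon.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
PURINES = {"A", "G"}


def substitution_type(ref: str, alt: str) -> str:
    """Transition = purine<->purine or pyrimidine<->pyrimidine; else transversion."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def coding_effect(codon: str, offset: int, alt: str) -> str:
    """Classify a substitution at 0-based `offset` within `codon`."""
    mutated = codon[:offset] + alt + codon[offset + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if before == after:
        return "silent"
    return "nonsense" if after == "*" else "missense"


# Hypothetical codons consistent with changes of the kind the tables report.
print(substitution_type("C", "T"), coding_effect("CCG", 1, "T"))  # Pro -> Leu
print(substitution_type("G", "T"), coding_effect("GAA", 0, "T"))  # Glu -> End
print(substitution_type("A", "C"), coding_effect("ACA", 2, "C"))  # Thr -> Thr
```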

SeqCalling assemblies produced by the exon linking process were selected and
extended using the following criteria. Genomic clones having regions with 98% identity to
all or part of the initial or extended sequence were identified by BLASTN searches using
the relevant sequence to query human genomic databases. The genomic clones that
resulted were selected for further analysis because this identity indicates that these clones
contain the genomic locus for these SeqCalling assemblies. These sequences were
analyzed for putative coding regions as well as for similarity to the known DNA and
protein sequences. Programs used for these analyses include Grail, Genscan, BLAST,
HMMER, FASTA, Hybrid and other relevant programs.
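
The 98% identity criterion can be illustrated with a small filter. This is a sketch of the threshold test only, not of BLASTN, which computes and reports identity itself; it assumes the two sequences are already aligned to equal length, with "-" marking gaps, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same base."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be non-empty and of equal aligned length")
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


def passes_identity_filter(aligned_a: str, aligned_b: str, threshold: float = 98.0) -> bool:
    """True when the aligned region meets the identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


query = "ACGT" * 25                  # 100 aligned positions
clone_hit = "TT" + query[2:]         # 2 mismatches
clone_miss = "TTT" + query[3:]       # 3 mismatches
print(percent_identity(query, clone_hit), passes_identity_filter(query, clone_hit))
print(percent_identity(query, clone_miss), passes_identity_filter(query, clone_miss))
```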

Some additional genomic regions may have also been identified because selected
SeqCalling assemblies map to those regions. Such SeqCalling sequences may have
overlapped with regions defined by homology or exon prediction. They may also be
included because the location of the fragment was in the vicinity of genomic regions
identified by similarity or exon prediction that had been included in the original predicted
sequence. The sequence so identified was manually assembled and then may have been
extended using one or more additional sequences taken from CuraGen Corporation's human
SeqCalling database. SeqCalling fragments suitable for inclusion were identified by the
CuraTools™ program SeqExtend or by identifying SeqCalling fragments mapping to the
appropriate regions of the genomic clones analyzed.

The regions defined by the procedures described above were then manually
integrated and corrected for apparent inconsistencies that may have arisen, for example,
from miscalled bases in the original fragments or from discrepancies between predicted
exon junctions, EST locations and regions of sequence similarity, to derive the final
sequence disclosed herein. When necessary, the process to identify and analyze SeqCalling
assemblies and genomic clones was reiterated to derive the full length sequence (Alderborn
et al., Determination of Single Nucleotide Polymorphisms by Real-time Pyrophosphate
DNA Sequencing. Genome Research. 10 (8) 1249-1265, 2000).

Variants are reported individually, but any combination of all or a select subset of
variants is also included as a contemplated NOVX embodiment of the invention.

NOV1a SNP Data:

NOV1a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:1 and 2, respectively. The
nucleotide sequence of the NOV1a variant differs as shown in Table SNP1.

Table SNP1.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13375555    4319       C         T           1440       Pro       Leu

NOV2b SNP Data: 

NOV2b has six SNP variants, whose variant positions for its nucleotide and amino 
acid sequences are numbered according to SEQ ID NOs:17 and 18, respectively. The 
nucleotide sequence of the NOV2b variant differs as shown in Table SNP2. 

Table SNP2.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
12252060    100        A         T           34         Ile       Phe
13380837    204        A         C           68         Thr       Thr
13380838    209        G         A           70         Gly       Asp
13380839    254        A         G           85         Gln       Arg
13380843    605        C         T           202        Ala       Val
13380844    614        C         T           205        Ala       Val
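
The nucleotide and amino acid positions in Table SNP2 are related by simple codon arithmetic. The sketch below assumes that the NOV2b open reading frame begins at nucleotide 1 of the reference sequence; that is an inference from the table rows, not something stated here, and other NOVX sequences evidently have different start positions. The function name and the choice to return 0 for positions upstream of the reading frame are likewise illustrative.

```python
def codon_number(nt_position: int, orf_start: int = 1) -> int:
    """1-based codon index containing a 1-based nucleotide position.

    Returns 0 for positions upstream of the assumed ORF start. Positions
    past the stop codon are not detected by this sketch.
    """
    if nt_position < orf_start:
        return 0
    return (nt_position - orf_start) // 3 + 1


# (nucleotide position, amino acid position) pairs from Table SNP2.
TABLE_SNP2 = [(100, 34), (204, 68), (209, 70), (254, 85), (605, 202), (614, 205)]

for nt, aa in TABLE_SNP2:
    print(nt, codon_number(nt), aa)
```

For every row the computed codon number agrees with the tabulated amino acid position, which supports the assumed reading-frame start for this sequence.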



NOV3b SNP Data:

NOV3b has seven SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:21 and 22, respectively.
The nucleotide sequence of the NOV3b variant differs as shown in Table SNP3.

Table SNP3.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13375856    338        G         A           0
13380855    397        T         G           0
13380857    1134       T         C           243        Val       Ala
13375853    1362       G         A           319        Arg       His
13380859    1376       A         G           324        Thr       Ala
13380860    1426       C         T           340        Cys       Cys
13380861    1496       C         T           0







NOV4b SNP Data: 

NOV4b has eleven SNP variants, whose variant positions for its nucleotide and 
amino acid sequences are numbered according to SEQ ID NOs:27 and 28, respectively. 
The nucleotide sequence of the NOV4b variant differs as shown in Table SNP4. 

Table SNP4.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13380847    73         G         C           12         Arg       Pro
13380848    116        G         A           26         Arg       Arg
13380849    117        A         T           27         Ile       Phe
13380862    200        G         T           54         Lys       Asn
13380863    222        G         T           62         Glu       End
13380864    243        G         T           69         Glu       End
13380850    338        C         T           100        Ile       Ile
13380851    438        G         T           134        Ala       Ser
13380865    779        A         T           247        Pro       Pro
13380852    1023       C         G           329        Pro       Ala
13380853    1494       C         T           0

NOV6a SNP Data:

NOV6a has two SNP variants, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:33 and 34, respectively. The
nucleotide sequence of the NOV6a variant differs as shown in Table SNP5.

Table SNP5.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13380868    1646       T         C           539        Val       Ala
13380869    2992       T         C           988        Cys       Arg



NOV11a SNP Data:

NOV11a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:47 and 48, respectively. The
nucleotide sequence of the NOV11a variant differs as shown in Table SNP6.

Table SNP6.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13380962    41         G         T           0







NOV12a SNP Data:

NOV12a has three SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:63 and 64, respectively.
The nucleotide sequence of the NOV12a variant differs as shown in Table SNP7.

Table SNP7.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13380902    594        C         T           193        Ser       Ser
13380901    1392       A         G           0
13380900    1425       C         T           0







NOV13a SNP Data: 

NOV13a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:65 and 66, respectively. The
nucleotide sequence of the NOV13a variant differs as shown in Table SNP8.

Table SNP8.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13380964    204        C         T           68         Leu       Leu

NOV14a SNP Data:

NOV14a has two SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:73 and 74, respectively.
The nucleotide sequence of the NOV14a variant differs as shown in Table SNP9.

Table SNP9.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13380922    106        C         G           28         Pro       Pro
13380923    760        A         G           246        Pro       Pro

NOV15a SNP Data:

NOV15a has two SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:77 and 78, respectively.
The nucleotide sequence of the NOV15a variant differs as shown in Table SNP10.

Table SNP10.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13380896    19         T         C           4          Phe       Leu
13380897    258        G         A           83         Pro       Pro



NOV20a SNP Data: 

NOV20a has seven SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:107 and 108, respectively.
The nucleotide sequence of the NOV20a variant differs as shown in Table SNP11.

Table SNP11.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13380969    155        G         A           0
13380970    448        A         G           79         His       Arg
13380971    475        G         C           88         Cys       Ser
13380972    780        A         G           190        Arg       Gly
13380974    890        A         G           226        Arg       Arg
13380975    1798       A         G           0
13380976    2564       A         G           0

NOV26a SNP Data: 

NOV26a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:119 and 120, respectively. The
nucleotide sequence of the NOV26a variant differs as shown in Table SNP12.

Table SNP12.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13377803    98         G         A           25         Met       Ile



NOV27a SNP Data: 

NOV27a has two SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:121 and 122, respectively.
The nucleotide sequence of the NOV27a variant differs as shown in Table SNP13.

Table SNP13.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13380980    186        A         G           22         Thr       Ala
13380979    292        C         T           57         Thr       Ile

NOV28a SNP Data:

NOV28a has two SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:123 and 124, respectively.
The nucleotide sequence of the NOV28a variant differs as shown in Table SNP14.

Table SNP14.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13380981    2192       G         A           721        Arg       Lys
13380982    2283       C         T           751        Phe       Phe



NOV29a SNP Data: 

NOV29a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:127 and 128, respectively. The
nucleotide sequence of the NOV29a variant differs as shown in Table SNP15.

Table SNP15.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13380985    46         [remaining entries illegible in source]



NOV31a SNP Data:

NOV31a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:133 and 134, respectively. The
nucleotide sequence of the NOV31a variant differs as shown in Table SNP16.

Table SNP16.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13380984    1232       G         A           335        Gly       Ser



NOV34a SNP Data: 



NOV34a has two SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:141 and 142, respectively.
The nucleotide sequence of the NOV34a variant differs as shown in Table SNP17.

Table SNP17.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13380987    1145       G         C           362        Arg       Thr
13380988    1749       A         T           0







NOV35a SNP Data:

NOV35a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:143 and 144, respectively. The
nucleotide sequence of the NOV35a variant differs as shown in Table SNP18.

Table SNP18.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13380995    85         C         T           22         Thr       Ile



NOV36a SNP Data: 

NOV36a has three SNP variants, whose variant positions for its nucleotide and 
amino acid sequences are numbered according to SEQ ID NOs:153 and 154, respectively. 
The nucleotide sequence of the NOV36a variant differs as shown in Table SNP19. 

Table SNP19.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13380998    411        G         A           122        Ser       Asn
13381013    492        T         C           149        Leu       Pro
13380999    686        T         C           214        Cys       Arg



NOV37a SNP Data: 

NOV37a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:155 and 156, respectively. The
nucleotide sequence of the NOV37a variant differs as shown in Table SNP20.

Table SNP20.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13381009    2077       C         G           0

NOV38a SNP Data:

NOV38a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:157 and 158, respectively. The
nucleotide sequence of the NOV38a variant differs as shown in Table SNP21.

Table SNP21.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13378369    994        C         T           330        Ser       Leu



NOV40a SNP Data: 

NOV40a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:167 and 168, respectively. The
nucleotide sequence of the NOV40a variant differs as shown in Table SNP22.

Table SNP22.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13381011    32         A         G           0

NOV41a SNP Data:

NOV41a has two SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:173 and 174, respectively.
The nucleotide sequence of the NOV41a variant differs as shown in Table SNP23.

Table SNP23.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13380997    247        A         G           55         Asn       Asp
13380996    417        A         G           111        Lys       Lys



NOV43a SNP Data:

NOV43a has eight SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:181 and 182, respectively.
The nucleotide sequence of the NOV43a variant differs as shown in Table SNP24.

Table SNP24.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13381140    184        G         A           61         Asp       Asn
13381141    337        T         C           112        Phe       Leu
13381158    729        G         T           242        Met       Ile
13381157    748        A         G           249        Ser       Gly
13381156    934        T         C           311        Phe       Leu
13381142    1916       A         G           0
13381143    2123       T         A           0
13381148    2260       G         C           [illegible]



NOV44a SNP Data:

NOV44a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:183 and 184, respectively. The
nucleotide sequence of the NOV44a variant differs as shown in Table SNP25.

Table SNP25.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13381168    1096       C         T           0

NOV45a SNP Data: 

NOV45a has two SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:185 and 186, respectively.
The nucleotide sequence of the NOV45a variant differs as shown in Table SNP26.

Table SNP26.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13381163    1269       T         C           399        Cys       Arg
13381162    1418       C         T           0

NOV46a SNP Data:

NOV46a has one SNP variant, whose variant positions for its nucleotide and amino
acid sequences are numbered according to SEQ ID NOs:187 and 188, respectively. The
nucleotide sequence of the NOV46a variant differs as shown in Table SNP27.

Table SNP27.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13381020    820        T         C           267        Phe       Phe

NOV48b SNP Data: 

NOV48b has five SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:193 and 194, respectively.
The nucleotide sequence of the NOV48b variant differs as shown in Table SNP28.

Table SNP28.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13375777    107        A         G           14         His       Arg
13376584    116        G         A           17         Ser       Asn
13381146    448        T         C           128        Cys       Arg
13378857    1282       G         A           406        Gly       Ser
13376583    1297       C         T           411        Pro       Ser

NOV49a SNP Data: 

NOV49a has twenty-one SNP variants, whose variant positions for its nucleotide
and amino acid sequences are numbered according to SEQ ID NOs:195 and 196,
respectively. The nucleotide sequence of the NOV49a variant differs as shown in Table
SNP29.

Table SNP29.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13379126    186        C         T           17         Ala       Val
13375663    212        C         G           26         Leu       Val
13375662    213        T         C           26         Leu       Pro
13379016    293        A         G           53         Ser       Gly
13378698    388        C         T           84         Phe       Phe
13381282    401        C         T           89         Gln       End
13381193    556        A         C           140        Thr       Thr
13381194    577        G         A           147        Gly       Gly
13381283    631        A         G           165        Lys       Lys
13378699    840        G         A           235        Ser       Asn
13378106    909        A         G           258        Asp       Gly
13381284    924        A         G           263        Lys       Arg
13377887    954        A         G           273        Glu       Gly
13381285    967        C         T           277        Gly       Gly
13381286    1009       A         G           291        Thr       Thr
13377889    1083       A         G           316        Gln       Arg
13381287    1107       A         G           324        Glu       Gly
13377890    1113       T         C           326        Val       Ala
13377891    1137       A         C           334        Gln       Pro
13381288    1196       C         G           0
13381289    1202       A         G           0







NOV50b SNP Data:

NOV50b has three SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:219 and 220, respectively.
The nucleotide sequence of the NOV50b variant differs as shown in Table SNP30.

Table SNP30.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13381192    216        G         A           48         Glu       Glu
13381177    602        G         T           177        Arg       Leu
13381190    698        C         T           0







NOV52b SNP Data: 

NOV52b has eight SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:229 and 230, respectively.
The nucleotide sequence of the NOV52b variant differs as shown in Table SNP31.

Table SNP31.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13381176    215        A         G           43         Glu       Glu
13376180    320        C         T           78         Tyr       Tyr
13376179    397        A         G           104        Gln       Arg
13381171    519        T         C           145        Ser       Pro
13381174    629        C         T           181        Ile       Ile
13381173    1173       C         A           363        Gln       Lys
13381172    1174       A         C           363        Gln       Pro
13381169    1402       A         G           0










NOV53c SNP Data:

NOV53c has two SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:237 and 238, respectively.
The nucleotide sequence of the NOV53c variant differs as shown in Table SNP32.

Table SNP32.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13380578    424        C         T           136        Asp       Asp
13380577    869        A         G           285        Thr       Ala



NOV55a SNP Data: 

NOV55a has thirteen SNP variants, whose variant positions for its nucleotide and
amino acid sequences are numbered according to SEQ ID NOs:245 and 246, respectively.
The nucleotide sequence of the NOV55a variant differs as shown in Table SNP33.

Table SNP33.

Variant     Nucleotides                      Amino Acids
            Position   Initial   Modified    Position   Initial   Modified
13375283    272        C         T           0
13375284    281        T         C           0
13377920    1226       T         C           203        Ser       Pro
13377921    1447       C         T           276        Tyr       Tyr
13377922    1765       C         T           382        Gly       Gly
13377907    2021       A         G           468        Thr       Ala
13377908    2074       T         C           485        Tyr       Tyr
13375287    2153       G         C           512        [illegible]  Leu
13375288    2157       C         T           513        Pro       Leu
13375289    2160       C         T           514        Thr       Ile
13375290    2329       G         A           0
13377903    2417       A         G           0
13377904    2559       C         T           0









Example E: Potential Role(s) of CG96736-01 in Obesity and/or Diabetes 

The NOV55a gene (CG96736-01) is a Na+-dependent neutral amino acid
transporter that exhibits high affinity electroneutral uptake of neutral amino acids such as
L-alanine, L-serine, L-threonine, L-cysteine and L-glutamine. This transporter prefers
neutral amino acids without bulky or branched side chains. It is localized to the plasma
membrane and has eight putative transmembrane segments. It appears to be a Type IIIa
membrane protein with an N-terminal cytoplasmic tail and a C-terminal extracellular
segment. In this respect, the expression pattern and its function in neutral amino acid uptake
are an indication of a role for NOV55a in obesity and/or diabetes.

Obesity and diabetes are major public health concerns in the developed and
developing world. It is estimated that over half of the adult US population is overweight,
with a body mass index (BMI) greater than the upper limit of normal (25), where the BMI is
defined as weight (kg) / [height (m)]^2. A common consequence of being overweight is
hyperlipidemia and the development of insulin resistance. This is followed by the
development of hyperglycemia, a hallmark of Type II diabetes. Left untreated, the
hyperglycemia leads to microvascular disease and end organ damage that includes
retinopathy, renal disease, cardiac disease, peripheral neuropathy and peripheral vascular
compromise. Currently, over 16 million adults in the US are affected and the condition has
now become rampant among school-age children as a consequence of the epidemic of
obesity in that age group.
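The BMI definition given above, weight in kilograms divided by the square of height in metres, can be written out as a short Python sketch. The function names and the example weight and height are hypothetical; only the formula and the upper limit of normal (25) come from the text.

```python
UPPER_LIMIT_OF_NORMAL = 25.0  # BMI threshold stated in the text


def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return weight (kg) divided by the square of height (m)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def is_overweight(weight_kg: float, height_m: float) -> bool:
    """True when the BMI exceeds the upper limit of normal."""
    return body_mass_index(weight_kg, height_m) > UPPER_LIMIT_OF_NORMAL


if __name__ == "__main__":
    # Hypothetical subject: 80 kg, 1.75 m.
    bmi = body_mass_index(80.0, 1.75)
    print(f"BMI = {bmi:.1f}, overweight = {is_overweight(80.0, 1.75)}")
```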

Several cellular, animal and clinical studies were performed to elucidate the genetic
contribution to the etiology and pathogenesis of these conditions in a variety of
physiologic, pharmacologic or native states. These studies utilized the core technologies at
CuraGen Corporation to look at differential gene expression, protein-protein interactions,
large-scale sequencing of expressed genes and the association of genetic variations such as,
but not limited to, single nucleotide polymorphisms (SNPs) or splice variants in and
between biological samples from experimental and control groups. The goal of such
studies is to identify potential avenues for therapeutic intervention in order to prevent the
conditions, treat their consequences or cure them.

In order to treat diseases, pathologies and other abnormal states or conditions in
which a mammalian organism has been diagnosed as being, or as being at risk for
becoming, other than in a normal state or condition, it is important to identify new
therapeutic agents. Such a procedure includes at least the steps of identifying a target
component within an affected tissue or organ, and identifying a candidate therapeutic agent
that modulates the functional attributes of the target. The target component may be any
biological macromolecule implicated in the disease or pathology. Commonly the target is a
polypeptide or protein with specific functional attributes. Other classes of macromolecule
that may serve as targets include a nucleic acid, a polysaccharide, and a lipid such as a
complex lipid or a glycolipid; in addition, a target may be a sub-cellular or extra-cellular
structure that comprises more than one of these classes of macromolecule. Once such a
target has been identified, it may be employed in a screening assay in order to identify
favorable candidate therapeutic agents from among a large population of substances or
compounds.

In many cases the objective of such screening assays is to identify small molecule
candidates; this is commonly approached by the use of combinatorial methodologies to
develop the population of substances to be tested. The implementation of high throughput
screening methodologies is advantageous when working with large, combinatorial libraries
of compounds.

In an important aspect, the present invention provides a method of identifying a
candidate therapeutic agent for treating a disease, pathology, or an abnormal state or
condition using a target entity having a specific association with the disease. This method
includes:

(a) identifying a target biopolymer associated with the disease, pathology,
or abnormal state or condition;

(b) contacting the biopolymer with at least one chemical compound; and

(c) identifying a compound that binds to the biopolymer as a candidate
therapeutic agent.

In important embodiments of this method, the chemical compound is a member of a
combinatorial library of compounds; the contacting in step (b) is conducted on one or more
replicate samples of the biopolymer; and the replicate sample is contacted with at least one
member of the combinatorial library. In additional embodiments of this method, the
biopolymer is included within a cell and is functionally expressed therein. In still a further
advantageous embodiment, the binding of the compound modulates the function of the
biopolymer, and it is the modulation that provides the identification that the compound is a
potential therapeutic agent. In yet further significant embodiments of this method, the
target biopolymer is a polypeptide.
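Steps (a) to (c), together with the embodiment in which modulation of function rather than binding alone flags a candidate, can be pictured as a filter applied to assay results for a compound library. The Python sketch below is illustrative only: the compound names, the binding and activity values, and both thresholds are hypothetical placeholders and are not data or parameters from this disclosure.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class AssayResult:
    """Hypothetical readout for one library member against the target."""
    compound: str
    binding_signal: float   # arbitrary units; higher means stronger binding
    activity_ratio: float   # target activity with compound / without compound


def select_candidates(results: Iterable[AssayResult],
                      min_binding: float = 1.0,
                      min_modulation: float = 0.2) -> List[str]:
    """Keep compounds that bind the target and change its activity."""
    candidates = []
    for result in results:
        binds = result.binding_signal >= min_binding
        # A ratio of 1.0 means no change; deviation in either direction counts.
        modulates = abs(result.activity_ratio - 1.0) >= min_modulation
        if binds and modulates:
            candidates.append(result.compound)
    return candidates


# Made-up results for three library members.
LIBRARY_RESULTS = [
    AssayResult("cmpd-001", 0.1, 1.00),  # does not bind
    AssayResult("cmpd-002", 2.4, 0.45),  # binds and reduces activity
    AssayResult("cmpd-003", 1.8, 1.02),  # binds without modulating
]

if __name__ == "__main__":
    print(select_candidates(LIBRARY_RESULTS))
```

Setting `min_modulation` to zero reduces the filter to the binding-only criterion of step (c).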

In a second aspect of the invention, a method for identifying a pharmaceutical agent
for treating a disease, pathology, or an abnormal state or condition is provided. The second
method includes the steps of:

(a) identifying a candidate therapeutic agent for treating said disease, pathology,
or abnormal state or condition by the first method described above;

(b) contacting a biological sample associated with the disease, pathology, or
abnormal state or condition with the candidate therapeutic agent;

(c) determining whether the candidate induces an effect on the biological
sample associated with a therapeutic response therein; and

(d) identifying a candidate exerting such an effect as a pharmaceutical agent.

In significant embodiments of the second method, the biological sample includes a
cell, a tissue or organ, or is a nonhuman mammal.

A gene fragment of the mouse Neutral Amino Acid Transporter B was initially
found to be up-regulated 6-fold in the adipose tissue of obese mice (AKR) relative to
non-obese mice (C57BL/6J) using CuraGen's GeneCalling™ method of differential gene
expression. Two differentially expressed mouse gene fragments migrating at
approximately 138 and 347 nucleotides in length (Tables MOU-3A and MOU-3B for
NOV55c (SEQ ID NO:438), and Tables MOU-3C and MOU-3D for NOV55d (SEQ ID
NO:439), respectively; vertical line) were definitively identified as components of the
Mouse Neutral Amino Acid Transporter B cDNA (in the graphs, the abscissa is measured
in lengths of nucleotides and the ordinate is measured as signal response). The method of
competitive PCR was used for confirmation of the gene assessment. The
electropherogramatic peaks corresponding to the gene fragment of the mouse Neutral
Amino Acid Transporter B are ablated when a gene-specific primer competes with primers
in the linker-adaptors during the PCR amplification. The peaks at 138 nt length are ablated
in the sample from both the obese and non-obese mice.
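The two quantitative notions used here, a fold change between obese and non-obese samples and the ablation of a peak under competitive PCR, can be sketched in Python as follows. The peak heights and the 10% residual cutoff are hypothetical values chosen for illustration; the text reports only the 6-fold result and does not state how ablation was scored.

```python
def fold_change(test_signal: float, reference_signal: float) -> float:
    """Ratio of the signal in the test sample to that in the reference."""
    if reference_signal <= 0:
        raise ValueError("reference signal must be positive")
    return test_signal / reference_signal


def is_ablated(signal_without_competitor: float,
               signal_with_competitor: float,
               max_residual_fraction: float = 0.1) -> bool:
    """True when the competing primer leaves at most the given fraction
    of the original peak (the cutoff is an assumption)."""
    if signal_without_competitor <= 0:
        raise ValueError("uncompeted signal must be positive")
    residual = signal_with_competitor / signal_without_competitor
    return residual <= max_residual_fraction


if __name__ == "__main__":
    # Hypothetical peak heights in arbitrary signal units.
    print(fold_change(600.0, 100.0))   # obese vs. non-obese
    print(is_ablated(600.0, 30.0))     # peak with gene-specific primer
```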

The direct sequences of the 138.4 and 346.7 nucleotide-long gene fragments and the
gene-specific primers used for competitive PCR are indicated on the cDNA sequence of the
Mouse Neutral Amino Acid Transporter B shown below, in bold. The gene-specific
primers at the 5' and 3' ends of the fragment are in italics.

Competitive PCR Primer for the Mouse Neutral Amino Acid Transporter B (peak at
138.4).

Table MOU-1. NOV55c Gene Sequence (fragment from 564 to 700 in bold, band 
size: 137) (SEQ ID NO:438) 

83 CCAGAGAGGA CCAGAGTGCG AAAGCAGGTG GTTGCTGCGG TTCCCGTGAC CGGGTGCGCC 143 

GCTGCATTCG CGCCAACCTG CTGGTGCTGC TCACGGTGGC TGCGGTGGTG GCTGGCGTGG 203 

GGCTGGGGCT GGGGGTCTCG GCGGCGGGCG GTGCTGACGC GCTGGGTCCC GCGCGCTTGA 263 

CCGCTTTCGC CTTCCCGGGA GAGCTGCTGC TGCGTCTGCT GAAGATGATC ATCCTGCCGC 323 

TCGTGGTGTG CAGCCTGATC GGAGGTGCAG CCAGCTTGGA CCCTAGCGCG CTCGGTCGTG 383 

TGGGCGCCTG GGCGCTGCTC TTTTTCCTGG TCACCACACT GCTCGCGTCG GCGCTCGGCG 443 

TGGGTTTGGC CCTGGCGCTG AAGCCGGGCG CCGCCGTTAC CGCCATCACC TCCATCAACG 503 

ACTCTGTTGT AGACCCCTGT GCCCGCAGTG CACCAACCAA AGAGGTGCTG GATTCCTTTC 563 

HAGATCTCGT GAGGAATATT TTCCCCTCCA ATCTGGTGTC TGCTGCCTTC CGCTCTTTTG 623 

CTACCTCATA TGAACCCAAA GACAACTCAT GTAAAATACC GCAATCCTGT ATCCAGCGGG 683 

AGATCAATTC AACCATGGTC CAGCTTCTCT GTGAGGTGGA GGGAATGAAC ATCCTGGGCC 743 

TGGTGGTCTT CGCTATCGTC TTTGGTGTGG CTCTGCGGAA GCTGGGGCCC GAGGGTGAGC 803 

TGCTCATTCG TTTCTTCAAC TCCTTCAATG ATGCCACCAT GGTCCTGGTC TCCTGGATTA 863

TGTGGTACGC ACCCGTTGGA ATCCTGTTCC TGGTGGCCAG CAAGATTGTG GAGATGAAAG 923 

ACGTCCGCCA GCTCTTCATC AGCCTCGGCA AATACATTCT GTGCTGCCTG CTGGGCCACG 983 

CCATCCACGG GCTCCTGGTT CTGCCTCTCA TCTACTTCCT CTTCACCCGC AAAAATCCCT 1043

ATCGATTCCT GTGGGGCATC ATGACACCCC TGGCCACTGC TTTCGGGACC TCTTCTAGCT 1103 

CTGCCACCTT GCCTCTGATG ATGAAGTGTG TAGAGGAGAA GAATGGTGTG GCCAAACACA 1163 

tcagccggtt catcctac (gene length is 1668, only region from 83 to 1180 shown) 
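The caption of Table MOU-1 gives the fragment as positions 564 to 700 with a band size of 137, which is consistent with 1-based coordinates that include both endpoints. The Python sketch below shows that arithmetic and the matching slice. Treating the coordinates as 1-based and inclusive is an inference from those numbers, and the function names and the toy sequence are illustrative only.

```python
def span_length(start: int, end: int) -> int:
    """Length of a 1-based coordinate range that includes both ends."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def extract_span(sequence: str, start: int, end: int) -> str:
    """Return the residues from start to end (1-based, inclusive)."""
    if end > len(sequence):
        raise ValueError("end lies beyond the sequence")
    # Python slices are 0-based and exclude the stop index.
    return sequence[start - 1:end]


if __name__ == "__main__":
    print(span_length(564, 700))   # NOV55c fragment
    print(span_length(1, 347))     # NOV55d fragment
    print(span_length(83, 1180))   # region of NOV55c shown in the table
    print(extract_span("ACGTACGTAC", 3, 6))  # toy sequence, not from the table
```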

Competitive PCR Primer for the Mouse Neutral Amino Acid Transporter B (peak at 
346.7). The gene-specific primers at the 5' and 3' ends of the fragment are in italics. 

Table MOU-2. NOV55d Gene Sequence (fragment from 1 to 347 in italics, band
size: 347) (SEQ ID NO:439)

QGATCCCTGC CGCACCG&CA CTGGATGCTG TGGCTGTGAC CCTGGGGAAG AGAAGAGCGG 61 
AGATGGCAGA ATCATGGGGG CGGGGCCTCC TGCCACAGCC CCTGGCACTC ACAGGATGGT 121 
GATGATCTTC ACGAAGTCCA GGGACACCCC GTTTAGTTGT GCGATGA[illegible] 181
ACACTGGAAC AGCGCCGCCC CGTCCATGTT GACCGTGGCG CCGATGGGTA GGATGAACCG 241 
GCTGATGTGT TTGGCCACAC CATTCTTCTC CTCTACACAC TTCATCATCA GAGGCAAGGT 301 
GGCAGAGCTA GAAGAGGTCC CGAAAGCAGT GGCCAGGGGT GTCATGA 

(gene length is 347, only region from 1 to 347 shown) 

Nucleic acid and amino acid sequences for NOV55a and NOV55b are disclosed in 
Table 55a, SNPs for NOV55a and NOV55b are disclosed in Table SNP33 and quantitative 
expression of these genes is shown in Tables AUA - AUK in Example D. 

Tables MOU-3A and MOU-3B show the differentially expressed mouse neutral amino
acid transporter B gene fragment NOV55c, and Tables MOU-3C and MOU-3D show the
differentially expressed mouse neutral amino acid transporter B gene fragment NOV55d.

Tables MOU-3A and MOU-3B. Differentially Expressed Mouse Neutral Amino
Acid Transporter B Gene Fragment, NOV55c.

Tables MOU-3C and MOU-3D. Differentially Expressed Mouse Neutral Amino
Acid Transporter B Gene Fragment, NOV55d.

OTHER EMBODIMENTS

Although particular embodiments have been disclosed herein in detail, this has been 
done by way of example for purposes of illustration only, and is not intended to be limiting 
with respect to the scope of the appended claims, which follow. In particular, it is 
contemplated by the inventors that various substitutions, alterations, and modifications may 
be made to the invention without departing from the spirit and scope of the invention as 
defined by the claims. The choice of nucleic acid starting material, clone of interest, or 
library type is believed to be a matter of routine for a person of ordinary skill in the art with 
knowledge of the embodiments described herein. Other aspects, advantages, and
modifications are considered to be within the scope of the following claims. The claims
presented are representative of the inventions disclosed herein. Other, unclaimed 
inventions are also contemplated. Applicants reserve the right to pursue such inventions in 
later claims.

CLAIMS

What is claimed is: 

1. An isolated polypeptide comprising the mature form of an amino acid
sequence selected from the group consisting of SEQ ID NO:2n, wherein n is an integer
between 1 and 124.

2. An isolated polypeptide comprising an amino acid sequence selected from 
the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 and 124. 

3. An isolated polypeptide comprising an amino acid sequence which is at 
least 95% identical to an amino acid sequence selected from the group consisting of SEQ 
ID NO:2n, wherein n is an integer between 1 and 124.

4. An isolated polypeptide, wherein the polypeptide comprises an amino acid 
sequence comprising one or more conservative substitutions in the amino acid sequence 
selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 
and 124. 

5. The polypeptide of claim 1 wherein said polypeptide is naturally occurring. 

6. A composition comprising the polypeptide of claim 1 and a carrier. 

7. A kit comprising, in one or more containers, the composition of claim 6. 

8. The use of a therapeutic in the manufacture of a medicament for treating a 
syndrome associated with a human disease, the disease selected from a pathology 
associated with the polypeptide of claim 1, wherein the therapeutic comprises the 
polypeptide of claim 1.

9. A method for determining the presence or amount of the polypeptide of 
claim 1 in a sample, the method comprising:

(a) providing said sample;

(b) introducing said sample to an antibody that binds immunospecifically to the
polypeptide; and 

(c) determining the presence or amount of antibody bound to said polypeptide, 
thereby determining the presence or amount of polypeptide in said sample. 

10. A method for determining the presence of or predisposition to a disease 
associated with altered levels of expression of the polypeptide of claim 1 in a first 
mammalian subject, the method comprising: 

a) measuring the level of expression of the polypeptide in a sample from the 
first mammalian subject; and 

b) comparing the expression of said polypeptide in the sample of step (a) to 
the expression of the polypeptide present in a control sample from a second 
mammalian subject known not to have, or not to be predisposed to, said 
disease, 

wherein an alteration in the level of expression of the polypeptide in the first subject as 
compared to the control sample indicates the presence of or predisposition to said disease. 

11. A method of identifying an agent that binds to the polypeptide of claim 1,
the method comprising: 

(a) introducing said polypeptide to said agent; and 

(b) determining whether said agent binds to said polypeptide.

12. The method of claim 11 wherein the agent is a cellular receptor or a
downstream effector. 

13. A method for identifying a potential therapeutic agent for use in treatment 
of a pathology, wherein the pathology is related to aberrant expression or aberrant 
physiological interactions of the polypeptide of claim 1, the method comprising:

(a) providing a cell expressing the polypeptide of claim 1 and having a 
property or function ascribable to the polypeptide; 

(b) contacting the cell with a composition comprising a candidate substance; 
and

(c) determining whether the substance alters the property or function ascribable
to the polypeptide; 

whereby, if an alteration observed in the presence of the substance is not observed when 
the cell is contacted with a composition in the absence of the substance, the substance is 
identified as a potential therapeutic agent. 

14. A method for screening for a modulator of activity of or of latency or 
predisposition to a pathology associated with the polypeptide of claim 1, said method 
comprising: 

(a) administering a test compound to a test animal at increased risk for a 
pathology associated with the polypeptide of claim 1, wherein said test
animal recombinantly expresses the polypeptide of claim 1;

(b) measuring the activity of said polypeptide in said test animal after 
administering the compound of step (a); and 

(c) comparing the activity of said polypeptide in said test animal with the 
activity of said polypeptide in a control animal not administered said 
polypeptide, wherein a change in the activity of said polypeptide in said test 
animal relative to said control animal indicates the test compound is a 
modulator of activity of, or of latency or predisposition to, a pathology associated
with the polypeptide of claim 1. 

15. The method of claim 14, wherein said test animal is a recombinant test 
animal that expresses a test protein transgene or expresses said transgene under the control 
of a promoter at an increased level relative to a wild-type test animal, and wherein said 
promoter is not the native gene promoter of said transgene. 

16. A method for modulating the activity of the polypeptide of claim 1, the
method comprising contacting a cell sample expressing the polypeptide of claim 1 with a 
compound that binds to said polypeptide in an amount sufficient to modulate the activity 
of the polypeptide. 

17. A method of treating or preventing a pathology associated with the 
polypeptide of claim 1, the method comprising administering the polypeptide of claim 1 to
a subject in which such treatment or prevention is desired in an amount sufficient to treat
or prevent the pathology in the subject. 

18. The method of claim 17, wherein the subject is a human.

19. A method of treating a pathological state in a mammal, the method 
comprising administering to the mammal a polypeptide in an amount that is sufficient to 
alleviate the pathological state, wherein the polypeptide is a polypeptide having an amino 
acid sequence at least 95% identical to a polypeptide comprising the amino acid sequence 
selected from the group consisting of SEQ ID NO:2n, wherein n is an integer between 1 
and 124 or a biologically active fragment thereof. 

20. An isolated nucleic acid molecule comprising a nucleic acid sequence 
selected from the group consisting of SEQ ID NO:2n-1, wherein n is an integer between 1
and 124. 

21. The nucleic acid molecule of claim 20, wherein the nucleic acid molecule is 
naturally occurring. 

22. A nucleic acid molecule, wherein the nucleic acid molecule differs by a 
single nucleotide from a nucleic acid sequence selected from the group consisting of SEQ 
ID NO:2n-1, wherein n is an integer between 1 and 124.

23. An isolated nucleic acid molecule encoding the mature form of a 
polypeptide having an amino acid sequence selected from the group consisting of SEQ ID 
NO:2n, wherein n is an integer between 1 and 124. 

24. An isolated nucleic acid molecule comprising a nucleic acid selected from 
the group consisting of 2n-1, wherein n is an integer between 1 and 124.

25. The nucleic acid molecule of claim 20, wherein said nucleic acid molecule 
hybridizes under stringent conditions to the nucleotide sequence selected from the group 
consisting of SEQ ID NO:2n-1, wherein n is an integer between 1 and 124, or a
complement of said nucleotide sequence. 

26. A vector comprising the nucleic acid molecule of claim 20. 

27. The vector of claim 26, further comprising a promoter operably linked to 
said nucleic acid molecule. 

28. A cell comprising the vector of claim 26. 

29. An antibody that immunospecifically binds to the polypeptide of claim 1.

30. The antibody of claim 29, wherein the antibody is a monoclonal antibody. 

31. The antibody of claim 29, wherein the antibody is a humanized antibody.

32. A method for determining the presence or amount of the nucleic acid 
molecule of claim 20 in a sample, the method comprising: 

(a) providing said sample; 

(b) introducing said sample to a probe that binds to said nucleic acid molecule; 
and 

(c) determining the presence or amount of said probe bound to said nucleic
acid molecule,

thereby determining the presence or amount of the nucleic acid molecule in said sample.

33. The method of claim 32 wherein presence or amount of the nucleic acid 
molecule is used as a marker for cell or tissue type. 

34. The method of claim 33 wherein the cell or tissue type is cancerous. 

35. A method for determining the presence of or predisposition to a disease 
associated with altered levels of expression of the nucleic acid molecule of claim 20 in a 
first mammalian subject, the method comprising: 

a) measuring the level of expression of the nucleic acid in a sample from the 
first mammalian subject; and 

b) comparing the level of expression of said nucleic acid in the sample of step 
(a) to the level of expression of the nucleic acid present in a control sample 
from a second mammalian subject known not to have or not be predisposed 
to, the disease; 

wherein an alteration in the level of expression of the nucleic acid in the first subject as 
compared to the control sample indicates the presence of or predisposition to the disease. 

36. A method of producing the polypeptide of claim 1, the method comprising
culturing a cell under conditions that lead to expression of the polypeptide, wherein said 
cell comprises a vector comprising an isolated nucleic acid molecule comprising a nucleic 
acid sequence selected from the group consisting of SEQ ID NO:2n-1, wherein n is an
integer between 1 and 124. 

37. The method of claim 36 wherein the cell is a bacterial cell. 

38. The method of claim 36 wherein the cell is an insect cell. 

39. The method of claim 36 wherein the cell is a yeast cell. 

40. The method of claim 36 wherein the cell is a mammalian cell. 

41. A method of producing the polypeptide of claim 2, the method comprising 
culturing a cell under conditions that lead to expression of the polypeptide, wherein said 
cell comprises a vector comprising an isolated nucleic acid molecule comprising a nucleic 
acid sequence selected from the group consisting of SEQ ID NO:2n-1, wherein n is an
integer between 1 and 124. 

42. The method of claim 41 wherein the cell is a bacterial cell. 

43. The method of claim 41 wherein the cell is an insect cell. 



44. The method of claim 41 wherein the cell is a yeast cell. 

45. The method of claim 41 wherein the cell is a mammalian cell. 